Timezone: »
Thompson sampling has impressive empirical performance for many multi-armed bandit problems. But current algorithms for Thompson sampling only work for the case of conjugate priors since they require to perform online Bayesian posterior inference, which is a difficult task when the prior is not conjugate. In this paper, we propose a novel algorithm for Thompson sampling which only requires to draw samples from a tractable proposal distribution. So our algorithm is efficient even when the prior is non-conjugate. To do this, we reformulate Thompson sampling as an optimization proplem via the Gumbel-Max trick. After that we construct a set of random variables and our goal is to identify the one with highest mean which is an instance of best arm identification problems. Finally, we solve it with techniques in best arm identification. Experiments show that our algorithm works well in practice.
Author Information
Yichi Zhou (Tsinghua University)
Jun Zhu (Tsinghua University)
Jingwei Zhuo (Tsinghua University)
Related Events (a corresponding poster, oral, or spotlight)
-
2018 Oral: Racing Thompson: an Efficient Algorithm for Thompson Sampling with Non-conjugate Priors »
Fri. Jul 13th 09:50 -- 10:00 AM Room A5
More from the Same Authors
-
2021 : Towards Safe Reinforcement Learning via Constraining Conditional Value at Risk »
Chengyang Ying · Xinning Zhou · Dong Yan · Jun Zhu -
2021 : Strategically-timed State-Observation Attacks on Deep Reinforcement Learning Agents »
Xinning Zhou · You Qiaoben · Chengyang Ying · Jun Zhu -
2021 : Adversarial Semantic Contour for Object Detection »
Yichi Zhang · Zijian Zhu · Xiao Yang · Jun Zhu -
2021 : Query-based Adversarial Attacks on Graph with Fake Nodes »
Zhengyi Wang · Zhongkai Hao · Jun Zhu -
2023 Poster: MultiAdam: Parameter-wise Scale-invariant Optimizer for Multiscale Training of Physics-informed Neural Networks »
Jiachen Yao · Chang Su · Zhongkai Hao · LIU SONGMING · Hang Su · Jun Zhu -
2023 Poster: Contrastive Energy Prediction for Exact Energy-Guided Diffusion Sampling in Offline Reinforcement Learning »
Cheng Lu · Huayu Chen · Jianfei Chen · Hang Su · Chongxuan Li · Jun Zhu -
2023 Poster: Stabilizing GANs' Training with Brownian Motion Controller »
Tianjiao Luo · Ziyu Zhu · Jianfei Chen · Jun Zhu -
2023 Poster: Revisiting Discriminative vs. Generative Classifiers: Theory and Implications »
Chenyu Zheng · Guoqiang Wu · Fan Bao · Yue Cao · Chongxuan Li · Jun Zhu -
2023 Poster: NUNO: A General Framework for Learning Parametric PDEs with Non-Uniform Data »
LIU SONGMING · Zhongkai Hao · Chengyang Ying · Hang Su · Ze Cheng · Jun Zhu -
2023 Poster: Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs »
Kaiwen Zheng · Cheng Lu · Jianfei Chen · Jun Zhu -
2023 Poster: One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale »
Fan Bao · Shen Nie · Kaiwen Xue · Chongxuan Li · Shi Pu · Yaole Wang · Gang Yue · Yue Cao · Hang Su · Jun Zhu -
2023 Poster: GNOT: A General Neural Operator Transformer for Operator Learning »
Zhongkai Hao · Zhengyi Wang · Hang Su · Chengyang Ying · Yinpeng Dong · LIU SONGMING · Ze Cheng · Jian Song · Jun Zhu -
2022 Poster: NeuralEF: Deconstructing Kernels by Deep Neural Networks »
Zhijie Deng · Jiaxin Shi · Jun Zhu -
2022 Spotlight: NeuralEF: Deconstructing Kernels by Deep Neural Networks »
Zhijie Deng · Jiaxin Shi · Jun Zhu -
2022 Poster: Robustness and Accuracy Could Be Reconcilable by (Proper) Definition »
Tianyu Pang · Min Lin · Xiao Yang · Jun Zhu · Shuicheng Yan -
2022 Poster: Fast Lossless Neural Compression with Integer-Only Discrete Flows »
Siyu Wang · Jianfei Chen · Chongxuan Li · Jun Zhu · Bo Zhang -
2022 Spotlight: Fast Lossless Neural Compression with Integer-Only Discrete Flows »
Siyu Wang · Jianfei Chen · Chongxuan Li · Jun Zhu · Bo Zhang -
2022 Spotlight: Robustness and Accuracy Could Be Reconcilable by (Proper) Definition »
Tianyu Pang · Min Lin · Xiao Yang · Jun Zhu · Shuicheng Yan -
2022 Poster: Thompson Sampling for (Combinatorial) Pure Exploration »
Siwei Wang · Jun Zhu -
2022 Spotlight: Thompson Sampling for (Combinatorial) Pure Exploration »
Siwei Wang · Jun Zhu -
2021 Poster: Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models »
Fan Bao · Kun Xu · Chongxuan Li · Lanqing Hong · Jun Zhu · Bo Zhang -
2021 Spotlight: Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models »
Fan Bao · Kun Xu · Chongxuan Li · Lanqing Hong · Jun Zhu · Bo Zhang -
2020 Poster: Understanding and Stabilizing GANs' Training Dynamics Using Control Theory »
Kun Xu · Chongxuan Li · Jun Zhu · Bo Zhang -
2020 Poster: Variance Reduction and Quasi-Newton for Particle-Based Variational Inference »
Michael Zhu · Chang Liu · Jun Zhu -
2020 Poster: Learning Optimal Tree Models under Beam Search »
Jingwei Zhuo · Ziru Xu · Wei Dai · Han Zhu · HAN LI · Jian Xu · Kun Gai -
2020 Poster: VFlow: More Expressive Generative Flows with Variational Data Augmentation »
Jianfei Chen · Cheng Lu · Biqi Chenli · Jun Zhu · Tian Tian -
2020 Poster: Nonparametric Score Estimators »
Yuhao Zhou · Jiaxin Shi · Jun Zhu -
2019 Poster: Understanding and Accelerating Particle-Based Variational Inference »
Chang Liu · Jingwei Zhuo · Pengyu Cheng · RUIYI (ROY) ZHANG · Jun Zhu -
2019 Oral: Understanding and Accelerating Particle-Based Variational Inference »
Chang Liu · Jingwei Zhuo · Pengyu Cheng · RUIYI (ROY) ZHANG · Jun Zhu -
2019 Poster: Improving Adversarial Robustness via Promoting Ensemble Diversity »
Tianyu Pang · Kun Xu · Chao Du · Ning Chen · Jun Zhu -
2019 Poster: Understanding MCMC Dynamics as Flows on the Wasserstein Space »
Chang Liu · Jingwei Zhuo · Jun Zhu -
2019 Oral: Understanding MCMC Dynamics as Flows on the Wasserstein Space »
Chang Liu · Jingwei Zhuo · Jun Zhu -
2019 Oral: Improving Adversarial Robustness via Promoting Ensemble Diversity »
Tianyu Pang · Kun Xu · Chao Du · Ning Chen · Jun Zhu -
2018 Poster: Message Passing Stein Variational Gradient Descent »
Jingwei Zhuo · Chang Liu · Jiaxin Shi · Jun Zhu · Ning Chen · Bo Zhang -
2018 Oral: Message Passing Stein Variational Gradient Descent »
Jingwei Zhuo · Chang Liu · Jiaxin Shi · Jun Zhu · Ning Chen · Bo Zhang -
2018 Poster: Max-Mahalanobis Linear Discriminant Analysis Networks »
Tianyu Pang · Chao Du · Jun Zhu -
2018 Poster: Adversarial Attack on Graph Structured Data »
Hanjun Dai · Hui Li · Tian Tian · Xin Huang · Lin Wang · Jun Zhu · Le Song -
2018 Oral: Max-Mahalanobis Linear Discriminant Analysis Networks »
Tianyu Pang · Chao Du · Jun Zhu -
2018 Oral: Adversarial Attack on Graph Structured Data »
Hanjun Dai · Hui Li · Tian Tian · Xin Huang · Lin Wang · Jun Zhu · Le Song -
2018 Poster: Stochastic Training of Graph Convolutional Networks with Variance Reduction »
Jianfei Chen · Jun Zhu · Le Song -
2018 Poster: A Spectral Approach to Gradient Estimation for Implicit Distributions »
Jiaxin Shi · Shengyang Sun · Jun Zhu -
2018 Oral: A Spectral Approach to Gradient Estimation for Implicit Distributions »
Jiaxin Shi · Shengyang Sun · Jun Zhu -
2018 Oral: Stochastic Training of Graph Convolutional Networks with Variance Reduction »
Jianfei Chen · Jun Zhu · Le Song -
2017 Poster: Identify the Nash Equilibrium in Static Games with Random Payoffs »
Yichi Zhou · Jialian Li · Jun Zhu -
2017 Talk: Identify the Nash Equilibrium in Static Games with Random Payoffs »
Yichi Zhou · Jialian Li · Jun Zhu