Timezone: »
We study the efficiency of Thompson sampling for contextual bandits. Existing Thompson sampling-based algorithms need to construct a Laplace approximation (i.e., a Gaussian distribution) of the posterior distribution, which is inefficient to sample in high dimensional applications for general covariance matrices. Moreover, the Gaussian approximation may not be a good surrogate for the posterior distribution for general reward generating functions. We propose an efficient posterior sampling algorithm, viz., Langevin Monte Carlo Thompson Sampling (LMC-TS), that uses Markov Chain Monte Carlo (MCMC) methods to directly sample from the posterior distribution in contextual bandits. Our method is computationally efficient since it only needs to perform noisy gradient descent updates without constructing the Laplace approximation of the posterior distribution. We prove that the proposed algorithm achieves the same sublinear regret bound as the best Thompson sampling algorithms for a special case of contextual bandits, viz., linear contextual bandits. We conduct experiments on both synthetic data and real-world datasets on different contextual bandit models, which demonstrates that directly sampling from the posterior is both computationally efficient and competitive in performance.
Author Information
Pan Xu (California Institute of Technology)
Hongkai Zheng (Caltech)
Eric Mazumdar (Caltech)
Kamyar Azizzadenesheli (Purdue University)
Animashree Anandkumar (Caltech and NVIDIA)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Poster: Langevin Monte Carlo for Contextual Bandits »
Tue. Jul 19th through Wed the 20th Room Hall E #818
More from the Same Authors
-
2022 : Physics-Informed Neural Operator for Learning Partial Differential Equations »
Zongyi Li · Hongkai Zheng · Nikola Kovachki · David Jin · Haoxuan Chen · Burigede Liu · Kamyar Azizzadenesheli · Animashree Anandkumar -
2023 Poster: Fast Sampling of Diffusion Models via Operator Learning »
Hongkai Zheng · Weili Nie · Arash Vahdat · Kamyar Azizzadenesheli · Anima Anandkumar -
2023 Poster: Algorithmic Collective Action in Machine Learning »
Moritz Hardt · Eric Mazumdar · Celestine Mendler-Dünner · Tijana Zrnic -
2023 Workshop: New Frontiers in Learning, Control, and Dynamical Systems »
Valentin De Bortoli · Charlotte Bunne · Guan-Horng Liu · Tianrong Chen · Maxim Raginsky · Pratik Chaudhari · Melanie Zeilinger · Animashree Anandkumar -
2022 Poster: Diffusion Models for Adversarial Purification »
Weili Nie · Brandon Guo · Yujia Huang · Chaowei Xiao · Arash Vahdat · Animashree Anandkumar -
2022 Spotlight: Diffusion Models for Adversarial Purification »
Weili Nie · Brandon Guo · Yujia Huang · Chaowei Xiao · Arash Vahdat · Animashree Anandkumar -
2022 Poster: Supervised Learning with General Risk Functionals »
Liu Leqi · Audrey Huang · Zachary Lipton · Kamyar Azizzadenesheli -
2022 Spotlight: Supervised Learning with General Risk Functionals »
Liu Leqi · Audrey Huang · Zachary Lipton · Kamyar Azizzadenesheli -
2022 Poster: Understanding The Robustness in Vision Transformers »
Zhou Daquan · Zhiding Yu · Enze Xie · Chaowei Xiao · Animashree Anandkumar · Jiashi Feng · Jose M. Alvarez -
2022 Spotlight: Understanding The Robustness in Vision Transformers »
Zhou Daquan · Zhiding Yu · Enze Xie · Chaowei Xiao · Animashree Anandkumar · Jiashi Feng · Jose M. Alvarez -
2021 : Invited Speaker: Animashree Anandkumar: Stability-aware reinforcement learning in dynamical systems »
Animashree Anandkumar -
2021 Workshop: Workshop on Socially Responsible Machine Learning »
Chaowei Xiao · Animashree Anandkumar · Mingyan Liu · Dawn Song · Raquel Urtasun · Jieyu Zhao · Xueru Zhang · Cihang Xie · Xinyun Chen · Bo Li -
2020 : Q&A: Anima Anandakumar »
Animashree Anandkumar · Jessica Forde -
2020 : Invited Talks: Anima Anandakumar »
Animashree Anandkumar -
2020 : Mentoring Panel: Doina Precup, Deborah Raji, Anima Anandkumar, Angjoo Kanazawa and Sinead Williamson (moderator). »
Doina Precup · Inioluwa Raji · Angjoo Kanazawa · Sinead A Williamson · Animashree Anandkumar -
2018 Poster: StrassenNets: Deep Learning with a Multiplication Budget »
Michael Tschannen · Aran Khanna · Animashree Anandkumar -
2018 Oral: StrassenNets: Deep Learning with a Multiplication Budget »
Michael Tschannen · Aran Khanna · Animashree Anandkumar