Timezone: »
Poster
Near-Optimal Representation Learning for Linear Bandits and Linear RL
Jiachen Hu · Xiaoyu Chen · Chi Jin · Lihong Li · Liwei Wang
This paper studies representation learning for multi-task linear bandits and multi-task episodic RL with linear value function approximation. We first consider the setting where we play $M$ linear bandits with dimension $d$ concurrently, and these bandits share a common $k$-dimensional linear representation so that $k\ll d$ and $k \ll M$. We propose a sample-efficient algorithm, MTLR-OFUL, which leverages the shared representation to achieve $\tilde{O}(M\sqrt{dkT} + d\sqrt{kMT} )$ regret, with $T$ being the number of total steps. Our regret significantly improves upon the baseline $\tilde{O}(Md\sqrt{T})$ achieved by solving each task independently. We further develop a lower bound that shows our regret is near-optimal when $d > M$. Furthermore, we extend the algorithm and analysis to multi-task episodic RL with linear value function approximation under low inherent Bellman error (Zanette et al., 2020a). To the best of our knowledge, this is the first theoretical result that characterize the benefits of multi-task representation learning for exploration in RL with function approximation.
Author Information
Jiachen Hu (Peking University)
Xiaoyu Chen (Peking University)
Chi Jin (Princeton University)
Lihong Li (Amazon)
Liwei Wang (Peking University)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Spotlight: Near-Optimal Representation Learning for Linear Bandits and Linear RL »
Thu. Jul 22nd 12:30 -- 12:35 AM Room
More from the Same Authors
-
2021 : The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces »
Chi Jin · Qinghua Liu · Tiancheng Yu -
2021 : Bellman Eluder Dimension: New Rich Classes of RL Problems, and Sample-Efficient Algorithms »
Chi Jin · Qinghua Liu · Sobhan Miryoosefi -
2021 : Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games »
Yu Bai · Chi Jin · Huan Wang · Caiming Xiong -
2023 : Is RLHF More Difficult than Standard RL? »
Chi Jin -
2023 Poster: Efficient displacement convex optimization with particle gradient descent »
Hadi Daneshmand · Jason Lee · Chi Jin -
2023 Poster: On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness »
Haotian Ye · Xiaoyu Chen · Liwei Wang · Simon Du -
2023 Poster: A Complete Expressiveness Hierarchy for Subgraph GNNs via Subgraph Weisfeiler-Lehman Tests »
Bohang Zhang · Guhao Feng · Yiheng Du · Di He · Liwei Wang -
2023 Oral: On the Power of Pre-training for Generalization in RL: Provable Benefits and Hardness »
Haotian Ye · Xiaoyu Chen · Liwei Wang · Simon Du -
2023 Poster: Offline Meta Reinforcement Learning with In-Distribution Online Adaptation »
Jianhao Wang · Jin Zhang · Haozhe Jiang · Junyu Zhang · Liwei Wang · Chongjie Zhang -
2022 Poster: A Simple Reward-free Approach to Constrained Reinforcement Learning »
Sobhan Miryoosefi · Chi Jin -
2022 Poster: Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets »
Han Zhong · Wei Xiong · Jiyuan Tan · Liwei Wang · Tong Zhang · Zhaoran Wang · Zhuoran Yang -
2022 Spotlight: A Simple Reward-free Approach to Constrained Reinforcement Learning »
Sobhan Miryoosefi · Chi Jin -
2022 Spotlight: Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets »
Han Zhong · Wei Xiong · Jiyuan Tan · Liwei Wang · Tong Zhang · Zhaoran Wang · Zhuoran Yang -
2022 Poster: The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces »
Chi Jin · Qinghua Liu · Tiancheng Yu -
2022 Poster: Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits »
Qinghua Liu · Yuanhao Wang · Chi Jin -
2022 Poster: Near-Optimal Learning of Extensive-Form Games with Imperfect Information »
Yu Bai · Chi Jin · Song Mei · Tiancheng Yu -
2022 Spotlight: Near-Optimal Learning of Extensive-Form Games with Imperfect Information »
Yu Bai · Chi Jin · Song Mei · Tiancheng Yu -
2022 Oral: Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits »
Qinghua Liu · Yuanhao Wang · Chi Jin -
2022 Spotlight: The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces »
Chi Jin · Qinghua Liu · Tiancheng Yu -
2022 Poster: Nearly Optimal Policy Optimization with Stable at Any Time Guarantee »
Tianhao Wu · Yunchang Yang · Han Zhong · Liwei Wang · Simon Du · Jiantao Jiao -
2022 Poster: Provable Reinforcement Learning with a Short-Term Memory »
Yonathan Efroni · Chi Jin · Akshay Krishnamurthy · Sobhan Miryoosefi -
2022 Poster: Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation »
Xiaoyu Chen · Han Zhong · Zhuoran Yang · Zhaoran Wang · Liwei Wang -
2022 Spotlight: Nearly Optimal Policy Optimization with Stable at Any Time Guarantee »
Tianhao Wu · Yunchang Yang · Han Zhong · Liwei Wang · Simon Du · Jiantao Jiao -
2022 Spotlight: Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation »
Xiaoyu Chen · Han Zhong · Zhuoran Yang · Zhaoran Wang · Liwei Wang -
2022 Spotlight: Provable Reinforcement Learning with a Short-Term Memory »
Yonathan Efroni · Chi Jin · Akshay Krishnamurthy · Sobhan Miryoosefi -
2021 : Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games »
Yu Bai · Chi Jin · Huan Wang · Caiming Xiong -
2021 : Discussion Panel #1 »
Hang Su · Matthias Hein · Liwei Wang · Sven Gowal · Jan Hendrik Metzen · Henry Liu · Yisen Wang -
2021 : Invited Talk #1 »
Liwei Wang -
2021 : RL + Recommender Systems Panel »
Alekh Agarwal · Ed Chi · Maria Dimakopoulou · Georgios Theocharous · Minmin Chen · Lihong Li -
2021 : RL Foundation Panel »
Matthew Botvinick · Thomas Dietterich · Leslie Kaelbling · John Langford · Warrren B Powell · Csaba Szepesvari · Lihong Li · Yuxi Li -
2021 Workshop: Reinforcement Learning for Real Life »
Yuxi Li · Minmin Chen · Omer Gottesman · Lihong Li · Zongqing Lu · Rupam Mahmood · Niranjani Prasad · Zhiwei (Tony) Qin · Csaba Szepesvari · Matthew Taylor -
2021 Poster: Towards Certifying L-infinity Robustness using Neural Networks with L-inf-dist Neurons »
Bohang Zhang · Tianle Cai · Zhou Lu · Di He · Liwei Wang -
2021 Spotlight: Towards Certifying L-infinity Robustness using Neural Networks with L-inf-dist Neurons »
Bohang Zhang · Tianle Cai · Zhou Lu · Di He · Liwei Wang -
2021 Poster: A Sharp Analysis of Model-based Reinforcement Learning with Self-Play »
Qinghua Liu · Tiancheng Yu · Yu Bai · Chi Jin -
2021 Poster: Provable Meta-Learning of Linear Representations »
Nilesh Tripuraneni · Chi Jin · Michael Jordan -
2021 Spotlight: Provable Meta-Learning of Linear Representations »
Nilesh Tripuraneni · Chi Jin · Michael Jordan -
2021 Spotlight: A Sharp Analysis of Model-based Reinforcement Learning with Self-Play »
Qinghua Liu · Tiancheng Yu · Yu Bai · Chi Jin -
2021 Poster: On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP »
Tianhao Wu · Yunchang Yang · Simon Du · Liwei Wang -
2021 Poster: Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning »
Yaqi Duan · Chi Jin · Zhiyuan Li -
2021 Spotlight: On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP »
Tianhao Wu · Yunchang Yang · Simon Du · Liwei Wang -
2021 Spotlight: Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning »
Yaqi Duan · Chi Jin · Zhiyuan Li -
2021 Poster: GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training »
Tianle Cai · Shengjie Luo · Keyulu Xu · Di He · Tie-Yan Liu · Liwei Wang -
2021 Spotlight: GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training »
Tianle Cai · Shengjie Luo · Keyulu Xu · Di He · Tie-Yan Liu · Liwei Wang -
2021 Poster: On the Optimality of Batch Policy Optimization Algorithms »
Chenjun Xiao · Yifan Wu · Jincheng Mei · Bo Dai · Tor Lattimore · Lihong Li · Csaba Szepesvari · Dale Schuurmans -
2021 Spotlight: On the Optimality of Batch Policy Optimization Algorithms »
Chenjun Xiao · Yifan Wu · Jincheng Mei · Bo Dai · Tor Lattimore · Lihong Li · Csaba Szepesvari · Dale Schuurmans -
2020 Poster: On Layer Normalization in the Transformer Architecture »
Ruibin Xiong · Yunchang Yang · Di He · Kai Zheng · Shuxin Zheng · Chen Xing · Huishuai Zhang · Yanyan Lan · Liwei Wang · Tie-Yan Liu -
2020 Poster: On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems »
Tianyi Lin · Chi Jin · Michael Jordan -
2020 Poster: Reward-Free Exploration for Reinforcement Learning »
Chi Jin · Akshay Krishnamurthy · Max Simchowitz · Tiancheng Yu -
2020 Poster: Batch Stationary Distribution Estimation »
Junfeng Wen · Bo Dai · Lihong Li · Dale Schuurmans -
2020 Poster: Provable Self-Play Algorithms for Competitive Reinforcement Learning »
Yu Bai · Chi Jin -
2020 Poster: (Locally) Differentially Private Combinatorial Semi-Bandits »
Xiaoyu Chen · Kai Zheng · Zixin Zhou · Yunchang Yang · Wei Chen · Liwei Wang -
2020 Poster: Neural Contextual Bandits with UCB-based Exploration »
Dongruo Zhou · Lihong Li · Quanquan Gu -
2020 Poster: Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition »
Chi Jin · Tiancheng Jin · Haipeng Luo · Suvrit Sra · Tiancheng Yu -
2020 Poster: Provably Efficient Exploration in Policy Optimization »
Qi Cai · Zhuoran Yang · Chi Jin · Zhaoran Wang -
2020 Poster: What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization? »
Chi Jin · Praneeth Netrapalli · Michael Jordan -
2019 Workshop: Reinforcement Learning for Real Life »
Yuxi Li · Alborz Geramifard · Lihong Li · Csaba Szepesvari · Tao Wang -
2019 Poster: Efficient Training of BERT by Progressively Stacking »
Linyuan Gong · Di He · Zhuohan Li · Tao Qin · Liwei Wang · Tie-Yan Liu -
2019 Oral: Efficient Training of BERT by Progressively Stacking »
Linyuan Gong · Di He · Zhuohan Li · Tao Qin · Liwei Wang · Tie-Yan Liu -
2019 Poster: Policy Certificates: Towards Accountable Reinforcement Learning »
Christoph Dann · Lihong Li · Wei Wei · Emma Brunskill -
2019 Poster: Gradient Descent Finds Global Minima of Deep Neural Networks »
Simon Du · Jason Lee · Haochuan Li · Liwei Wang · Xiyu Zhai -
2019 Oral: Policy Certificates: Towards Accountable Reinforcement Learning »
Christoph Dann · Lihong Li · Wei Wei · Emma Brunskill -
2019 Oral: Gradient Descent Finds Global Minima of Deep Neural Networks »
Simon Du · Jason Lee · Haochuan Li · Liwei Wang · Xiyu Zhai -
2018 Poster: Towards Binary-Valued Gates for Robust LSTM Training »
Zhuohan Li · Di He · Fei Tian · Wei Chen · Tao Qin · Liwei Wang · Tie-Yan Liu -
2018 Poster: Scalable Bilinear Pi Learning Using State and Action Features »
Yichen Chen · Lihong Li · Mengdi Wang -
2018 Poster: SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation »
Bo Dai · Albert Shaw · Lihong Li · Lin Xiao · Niao He · Zhen Liu · Jianshu Chen · Le Song -
2018 Oral: Towards Binary-Valued Gates for Robust LSTM Training »
Zhuohan Li · Di He · Fei Tian · Wei Chen · Tao Qin · Liwei Wang · Tie-Yan Liu -
2018 Oral: Scalable Bilinear Pi Learning Using State and Action Features »
Yichen Chen · Lihong Li · Mengdi Wang -
2018 Oral: SBEED: Convergent Reinforcement Learning with Nonlinear Function Approximation »
Bo Dai · Albert Shaw · Lihong Li · Lin Xiao · Niao He · Zhen Liu · Jianshu Chen · Le Song -
2018 Poster: Dropout Training, Data-dependent Regularization, and Generalization Bounds »
Wenlong Mou · Yuchen Zhou · Jun Gao · Liwei Wang -
2018 Oral: Dropout Training, Data-dependent Regularization, and Generalization Bounds »
Wenlong Mou · Yuchen Zhou · Jun Gao · Liwei Wang -
2017 Poster: Collect at Once, Use Effectively: Making Non-interactive Locally Private Learning Possible »
Kai Zheng · Wenlong Mou · Liwei Wang -
2017 Talk: Collect at Once, Use Effectively: Making Non-interactive Locally Private Learning Possible »
Kai Zheng · Wenlong Mou · Liwei Wang -
2017 Poster: Stochastic Variance Reduction Methods for Policy Evaluation »
Simon Du · Jianshu Chen · Lihong Li · Lin Xiao · Dengyong Zhou -
2017 Talk: Stochastic Variance Reduction Methods for Policy Evaluation »
Simon Du · Jianshu Chen · Lihong Li · Lin Xiao · Dengyong Zhou -
2017 Poster: Provably Optimal Algorithms for Generalized Linear Contextual Bandits »
Lihong Li · Yu Lu · Dengyong Zhou -
2017 Talk: Provably Optimal Algorithms for Generalized Linear Contextual Bandits »
Lihong Li · Yu Lu · Dengyong Zhou