Finding the minimal structural assumptions that empower sample-efficient learning is one of the most important research directions in Reinforcement Learning (RL). This paper advances our understanding of this fundamental question by introducing a new complexity measure, the Bellman Eluder (BE) dimension. We show that the family of RL problems of low BE dimension is remarkably rich: it subsumes a vast majority of existing tractable RL problems, including but not limited to tabular MDPs, linear MDPs, reactive POMDPs, low Bellman rank problems, and low Eluder dimension problems. This paper further designs a new optimization-based algorithm, GOLF, and reanalyzes a hypothesis elimination-based algorithm, OLIVE (proposed in Jiang et al., 2017). We prove that both algorithms learn near-optimal policies for low BE dimension problems in a number of samples that is polynomial in all relevant parameters but independent of the size of the state-action space. Our regret and sample complexity results match or improve the best existing results for several well-known subclasses of low BE dimension problems.
Author Information
Chi Jin (Princeton University)
Qinghua Liu (Princeton University)
Sobhan Miryoosefi (Princeton University)
More from the Same Authors
-
2021 : The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces »
Chi Jin · Qinghua Liu · Tiancheng Yu -
2021 : Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games »
Yu Bai · Chi Jin · Huan Wang · Caiming Xiong -
2023 : Is RLHF More Difficult than Standard RL? »
Chi Jin -
2023 Poster: Efficient displacement convex optimization with particle gradient descent »
Hadi Daneshmand · Jason Lee · Chi Jin -
2022 Poster: A Simple Reward-free Approach to Constrained Reinforcement Learning »
Sobhan Miryoosefi · Chi Jin -
2022 Spotlight: A Simple Reward-free Approach to Constrained Reinforcement Learning »
Sobhan Miryoosefi · Chi Jin -
2022 Poster: The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces »
Chi Jin · Qinghua Liu · Tiancheng Yu -
2022 Poster: Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits »
Qinghua Liu · Yuanhao Wang · Chi Jin -
2022 Poster: Near-Optimal Learning of Extensive-Form Games with Imperfect Information »
Yu Bai · Chi Jin · Song Mei · Tiancheng Yu -
2022 Spotlight: Near-Optimal Learning of Extensive-Form Games with Imperfect Information »
Yu Bai · Chi Jin · Song Mei · Tiancheng Yu -
2022 Oral: Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits »
Qinghua Liu · Yuanhao Wang · Chi Jin -
2022 Spotlight: The Power of Exploiter: Provable Multi-Agent RL in Large State Spaces »
Chi Jin · Qinghua Liu · Tiancheng Yu -
2022 Poster: Provable Reinforcement Learning with a Short-Term Memory »
Yonathan Efroni · Chi Jin · Akshay Krishnamurthy · Sobhan Miryoosefi -
2022 Spotlight: Provable Reinforcement Learning with a Short-Term Memory »
Yonathan Efroni · Chi Jin · Akshay Krishnamurthy · Sobhan Miryoosefi -
2021 Poster: Near-Optimal Representation Learning for Linear Bandits and Linear RL »
Jiachen Hu · Xiaoyu Chen · Chi Jin · Lihong Li · Liwei Wang -
2021 Poster: A Sharp Analysis of Model-based Reinforcement Learning with Self-Play »
Qinghua Liu · Tiancheng Yu · Yu Bai · Chi Jin -
2021 Poster: Provable Meta-Learning of Linear Representations »
Nilesh Tripuraneni · Chi Jin · Michael Jordan -
2021 Spotlight: Provable Meta-Learning of Linear Representations »
Nilesh Tripuraneni · Chi Jin · Michael Jordan -
2021 Spotlight: A Sharp Analysis of Model-based Reinforcement Learning with Self-Play »
Qinghua Liu · Tiancheng Yu · Yu Bai · Chi Jin -
2021 Spotlight: Near-Optimal Representation Learning for Linear Bandits and Linear RL »
Jiachen Hu · Xiaoyu Chen · Chi Jin · Lihong Li · Liwei Wang -
2021 Poster: Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning »
Yaqi Duan · Chi Jin · Zhiyuan Li -
2021 Spotlight: Risk Bounds and Rademacher Complexity in Batch Reinforcement Learning »
Yaqi Duan · Chi Jin · Zhiyuan Li -
2020 Poster: On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems »
Tianyi Lin · Chi Jin · Michael Jordan -
2020 Poster: Reward-Free Exploration for Reinforcement Learning »
Chi Jin · Akshay Krishnamurthy · Max Simchowitz · Tiancheng Yu -
2020 Poster: Provable Self-Play Algorithms for Competitive Reinforcement Learning »
Yu Bai · Chi Jin -
2020 Poster: Learning Adversarial Markov Decision Processes with Bandit Feedback and Unknown Transition »
Chi Jin · Tiancheng Jin · Haipeng Luo · Suvrit Sra · Tiancheng Yu -
2020 Poster: Provably Efficient Exploration in Policy Optimization »
Qi Cai · Zhuoran Yang · Chi Jin · Zhaoran Wang -
2020 Poster: What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization? »
Chi Jin · Praneeth Netrapalli · Michael Jordan