Timezone: »
Poster
Information-Theoretic Considerations in Batch Reinforcement Learning
Jinglin Chen · Nan Jiang
Value-function approximation methods that operate in batch mode have foundational importance to reinforcement learning (RL). Finite sample guarantees for these methods often crucially rely on two types of assumptions: (1) mild distribution shift, and (2) representation conditions that are stronger than realizability. However, the necessity (“why do we need them?”) and the naturalness (“when do they hold?”) of such assumptions have largely eluded the literature. In this paper, we revisit these assumptions and provide theoretical results towards answering the above questions, and make steps towards a deeper understanding of value-function approximation.
Author Information
Jinglin Chen (University of Illinois at Urbana-Champaign)
Nan Jiang (University of Illinois at Urbana-Champaign)
Related Events (a corresponding poster, oral, or spotlight)
-
2019 Oral: Information-Theoretic Considerations in Batch Reinforcement Learning »
Wed. Jun 12th 12:15 -- 12:20 AM Room Room 103
More from the Same Authors
-
2021 : Nonstationary Reinforcement Learning with Linear Function Approximation »
Huozhi Zhou · Jinglin Chen · Lav Varshney · Ashish Jagmohan -
2021 : A Spectral Approach to Off-Policy Evaluation for POMDPs »
Yash Nair · Nan Jiang -
2021 : Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning »
Tengyang Xie · Nan Jiang · Huan Wang · Caiming Xiong · Yu Bai -
2022 : Interaction-Grounded Learning with Action-inclusive Feedback »
Tengyang Xie · Akanksha Saran · Dylan Foster · Lekan Molu · Ida Momennejad · Nan Jiang · Paul Mineiro · John Langford -
2022 : Beyond the Return: Off-policy Function Estimation under User-specified Error-measuring Distributions »
Audrey Huang · Nan Jiang -
2023 Poster: Offline Learning in Markov Games with General Function Approximation »
Yuheng Zhang · Yu Bai · Nan Jiang -
2023 Poster: The Optimal Approximation Factors in Misspecified Off-Policy Value Function Estimation »
Philip Amortila · Nan Jiang · Csaba Szepesvari -
2023 Poster: Reinforcement Learning in Low-rank MDPs with Density Features »
Audrey Huang · Jinglin Chen · Nan Jiang -
2022 Poster: Adversarially Trained Actor Critic for Offline Reinforcement Learning »
Ching-An Cheng · Tengyang Xie · Nan Jiang · Alekh Agarwal -
2022 Oral: Adversarially Trained Actor Critic for Offline Reinforcement Learning »
Ching-An Cheng · Tengyang Xie · Nan Jiang · Alekh Agarwal -
2022 Poster: A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes »
Chengchun Shi · Masatoshi Uehara · Jiawei Huang · Nan Jiang -
2022 Oral: A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes »
Chengchun Shi · Masatoshi Uehara · Jiawei Huang · Nan Jiang -
2021 Poster: Batch Value-function Approximation with Only Realizability »
Tengyang Xie · Nan Jiang -
2021 Spotlight: Batch Value-function Approximation with Only Realizability »
Tengyang Xie · Nan Jiang -
2020 Poster: Minimax Weight and Q-Function Learning for Off-Policy Evaluation »
Masatoshi Uehara · Jiawei Huang · Nan Jiang -
2020 Poster: From Importance Sampling to Doubly Robust Policy Gradient »
Jiawei Huang · Nan Jiang -
2019 Poster: Provably efficient RL with Rich Observations via Latent State Decoding »
Simon Du · Akshay Krishnamurthy · Nan Jiang · Alekh Agarwal · Miroslav Dudik · John Langford -
2019 Oral: Provably efficient RL with Rich Observations via Latent State Decoding »
Simon Du · Akshay Krishnamurthy · Nan Jiang · Alekh Agarwal · Miroslav Dudik · John Langford