Offline reinforcement learning (RL) offers a promising direction for learning policies from pre-collected datasets without further interaction with the environment. However, existing methods struggle to handle out-of-distribution (OOD) extrapolation errors, especially in settings with sparse rewards or scarce data. In this paper, we propose a novel training algorithm called Conservative Density Estimation (CDE), which addresses this challenge by explicitly imposing constraints on the stationary state-action occupancy distribution. CDE overcomes the limitations of existing approaches, such as stationary distribution correction methods, by addressing the support mismatch issue in marginal importance sampling. Our method achieves state-of-the-art performance on the D4RL benchmark. Notably, CDE consistently outperforms baselines on challenging tasks with sparse rewards or insufficient data, demonstrating the advantages of our approach in mitigating extrapolation error in offline RL.
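As a toy illustration of the support mismatch issue the abstract refers to (this is a minimal sketch with made-up numbers, not the paper's CDE implementation): marginal importance sampling estimates a policy's value by reweighting dataset samples with the stationary-distribution ratio w(s,a) = d_pi(s,a) / d_D(s,a). Wherever the dataset occupancy d_D is zero but the target policy occupancy d_pi is not, the ratio is undefined and that probability mass is silently dropped, biasing the estimate.

```python
import numpy as np

# Toy tabular example of marginal importance sampling (MIS) in offline RL.
# The off-policy value is estimated as E_{(s,a)~d_D}[ w(s,a) * r(s,a) ],
# where w(s,a) = d_pi(s,a) / d_D(s,a). All numbers below are illustrative.

d_D  = np.array([0.5, 0.3, 0.2, 0.0])   # dataset occupancy; last pair unseen
d_pi = np.array([0.2, 0.2, 0.3, 0.3])   # target policy occupancy
r    = np.array([1.0, 0.0, 1.0, 5.0])   # per-pair rewards

# The MIS ratio is only defined on the dataset's support.
on_support = d_D > 0
w = np.zeros_like(d_D)
w[on_support] = d_pi[on_support] / d_D[on_support]

# The importance-weighted estimate drops all out-of-support mass of d_pi:
mis_estimate = float(np.sum(d_D * w * r))  # sums d_pi * r over support only
true_value   = float(np.sum(d_pi * r))     # includes the unseen pair

print(f"MIS estimate: {mis_estimate:.2f}")  # underestimates: misses mass where d_D = 0
print(f"True value:   {true_value:.2f}")
```

Here the estimator misses the 0.3 occupancy mass the target policy places on the state-action pair absent from the dataset, which is exactly the regime (scarce data, unseen high-reward pairs) where the abstract claims occupancy-constrained methods help.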
Author Information
Zhepeng Cen (Carnegie Mellon University)
Zuxin Liu (Carnegie Mellon University)
Zitong Wang (Columbia University)
Yihang Yao (Carnegie Mellon University)
Henry Lam (Columbia University)
Ding Zhao (Carnegie Mellon University)
More from the Same Authors
- 2022 : Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables »
  Mengdi Xu · Peide Huang · Visak Kumar · Jielin Qiu · Chao Fang · Kuan-Hui Lee · Xuewei Qi · Henry Lam · Bo Li · Ding Zhao
- 2022 : Paper 22: Multimodal Unsupervised Car Segmentation via Adaptive Aerial Image-to-Image Translation »
  Haohong Lin · Zhepeng Cen · Peide Huang · Hanjiang Hu
- 2022 : Paper 2: SeasonDepth: Cross-Season Monocular Depth Prediction Dataset and Benchmark under Multiple Environments »
  Ding Zhao · Hitesh Arora · Jiacheng Zhu · Zuxin Liu · Wenhao Ding
- 2022 : Paper 10: CausalAF: Causal Autoregressive Flow for Safety-Critical Scenes Generation »
  Wenhao Ding · Haohong Lin · Bo Li · Ding Zhao · Hitesh Arora
- 2023 : DiffScene: Diffusion-Based Safety-Critical Scenario Generation for Autonomous Vehicles »
  Chejian Xu · Ding Zhao · Alberto Sangiovanni Vincentelli · Bo Li
- 2023 : Seeing is not Believing: Robust Reinforcement Learning against Spurious Correlation »
  Wenhao Ding · Laixi Shi · Yuejie Chi · Ding Zhao
- 2023 : Offline Reinforcement Learning with Imbalanced Datasets »
  Li Jiang · Sijie Cheng · Jielin Qiu · Victor Chan · Ding Zhao
- 2023 : Semantically Adversarial Scene Generation with Explicit Knowledge Guidance for Autonomous Driving »
  Wenhao Ding · Haohong Lin · Bo Li · Ding Zhao
- 2023 : Learning Shared Safety Constraints from Multi-task Demonstrations »
  Konwoo Kim · Gokul Swamy · Zuxin Liu · Ding Zhao · Sanjiban Choudhury · Steven Wu
- 2023 : Visual-based Policy Learning with Latent Language Encoding »
  Jielin Qiu · Mengdi Xu · William Han · Bo Li · Ding Zhao
- 2023 : Can Brain Signals Reveal Inner Alignment with Human Languages? »
  Jielin Qiu · William Han · Jiacheng Zhu · Mengdi Xu · Douglas Weber · Bo Li · Ding Zhao
- 2023 : Multimodal Representation Learning of Cardiovascular Magnetic Resonance Imaging »
  Jielin Qiu · Peide Huang · Makiya Nakashima · Jaehyun Lee · Jiacheng Zhu · Wilson Tang · Pohao Chen · Christopher Nguyen · Byung-Hak Kim · Debbie Kwon · Douglas Weber · Ding Zhao · David Chen
- 2023 : Robustness Verification for Perception Models against Camera Motion Perturbations »
  Hanjiang Hu · Changliu Liu · Ding Zhao
- 2023 Poster: Bootstrap in High Dimension with Low Computation »
  Henry Lam · Zhenyuan Liu
- 2023 Poster: Constrained Decision Transformer for Offline Safe Reinforcement Learning »
  Zuxin Liu · Zijian Guo · Yihang Yao · Zhepeng Cen · Wenhao Yu · Tingnan Zhang · Ding Zhao
- 2023 Poster: Towards Robust and Safe Reinforcement Learning with Benign Off-policy Data »
  Zuxin Liu · Zijian Guo · Zhepeng Cen · Huan Zhang · Yihang Yao · Hanjiang Hu · Ding Zhao
- 2023 Poster: Bayesian Reparameterization of Reward-Conditioned Reinforcement Learning with Energy-based Models »
  Wenhao Ding · Tong Che · Ding Zhao · Marco Pavone
- 2023 Poster: Interpolation for Robust Learning: Data Augmentation on Wasserstein Geodesics »
  Jiacheng Zhu · Jielin Qiu · Aritra Guha · Zhuolin Yang · XuanLong Nguyen · Bo Li · Ding Zhao
- 2022 : Paper 15: On the Robustness of Safe Reinforcement Learning under Observational Perturbations »
  Zuxin Liu · Zhepeng Cen · Huan Zhang · Jie Tan · Bo Li · Ding Zhao
- 2022 : Paper 16: Constrained Model-based Reinforcement Learning via Robust Planning »
  Zuxin Liu · Ding Zhao
- 2022 Poster: Constrained Variational Policy Optimization for Safe Reinforcement Learning »
  Zuxin Liu · Zhepeng Cen · Vladislav Isenbaev · Wei Liu · Steven Wu · Bo Li · Ding Zhao
- 2022 Spotlight: Constrained Variational Policy Optimization for Safe Reinforcement Learning »
  Zuxin Liu · Zhepeng Cen · Vladislav Isenbaev · Wei Liu · Steven Wu · Bo Li · Ding Zhao