Timezone: »
In recent years, data-driven reinforcement learning (RL), also known as offline RL, have gained significant attention. However, the role of data sampling techniques in offline RL has been overlooked despite its potential to enhance online RL performance. Recent research suggests applying sampling techniques directly to state-transitions does not consistently improve performance in offline RL. Therefore, in this study, we propose a memory technique, (Prioritized) Trajectory Replay (TR/PTR), which extends the sampling perspective to trajectories for more comprehensive information extraction from limited data. TR enhances learning efficiency by backward sampling of trajectories that optimizes the use of subsequent state information. Building on TR, we build the weighted critic target to avoid sampling unseen actions in offline training, and Prioritized Trajectory Replay (PTR) that enables more efficient trajectory sampling, prioritized by various trajectory priority metrics. We demonstrate the benefits of integrating TR and PTR with existing offline RL algorithms on D4RL. In summary, our research emphasizes the significance of trajectory-based data sampling techniques in enhancing the efficiency and performance of offline RL algorithms.
Author Information
Jinyi Liu
Yi Ma (Tianjin University)
Jianye Hao (Tianjin University)
Yujing Hu (NetEase Fuxi AI Lab)
Yan Zheng (Tianjin University, Nanyang Technical University)
Tangjie Lv (NetEase Fuxi AI Lab)
Changjie Fan (NetEase Fuxi AI Lab)
More from the Same Authors
-
2021 : Optimistic Exploration with Backward Bootstrapped Bonus for Deep Reinforcement Learning »
Chenjia Bai · Lingxiao Wang · Lei Han · Jianye Hao · Animesh Garg · Peng Liu · Zhaoran Wang -
2023 : Limited Information Opponent Modeling »
Yongliang Lv · Yan Zheng · jianye Hao -
2023 : Boosting Off-policy RL with Policy Representation and Policy-extended Value Function Approximator »
Min Zhang · Jianye Hao · Hongyao Tang · Yan Zheng -
2023 : A Policy-Decoupled Method for High-Quality Data Augmentation in Offline Reinforcement Learning »
Shixi Lian · Yi Ma · Jinyi Liu · Jianye Hao · Yan Zheng · Zhaopeng Meng -
2023 : Improving Offline-to-Online Reinforcement Learning with Q-Ensembles »
Kai Zhao · Yi Ma · Jinyi Liu · Jianye Hao · Yan Zheng · Zhaopeng Meng -
2023 Poster: RACE: Improve Multi-Agent Reinforcement Learning with Representation Asymmetry and Collaborative Evolution »
Pengyi Li · Jianye Hao · Hongyao Tang · Yan Zheng · Xian Fu -
2023 Poster: MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL »
Fei Ni · Jianye Hao · Yao Mu · Yifu Yuan · Yan Zheng · Bin Wang · Zhixuan Liang -
2023 Poster: ChiPFormer: Transferable Chip Placement via Offline Decision Transformer »
Yao LAI · Jinxin Liu · Zhentao Tang · Bin Wang · Jianye Hao · Ping Luo -
2022 Poster: Individual Reward Assisted Multi-Agent Reinforcement Learning »
Li Wang · Yupeng Zhang · Yujing Hu · Weixun Wang · Chongjie Zhang · Yang Gao · Jianye Hao · Tangjie Lv · Changjie Fan -
2022 Poster: Deconfounded Value Decomposition for Multi-Agent Reinforcement Learning »
Jiahui Li · Kun Kuang · Baoxiang Wang · Furui Liu · Long Chen · Changjie Fan · Fei Wu · Jun Xiao -
2022 Poster: PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration »
Pengyi Li · Hongyao Tang · Tianpei Yang · Xiaotian Hao · Tong Sang · Yan Zheng · Jianye Hao · Matthew Taylor · Wenyuan Tao · Zhen Wang -
2022 Spotlight: Deconfounded Value Decomposition for Multi-Agent Reinforcement Learning »
Jiahui Li · Kun Kuang · Baoxiang Wang · Furui Liu · Long Chen · Changjie Fan · Fei Wu · Jun Xiao -
2022 Spotlight: Individual Reward Assisted Multi-Agent Reinforcement Learning »
Li Wang · Yupeng Zhang · Yujing Hu · Weixun Wang · Chongjie Zhang · Yang Gao · Jianye Hao · Tangjie Lv · Changjie Fan -
2022 Spotlight: PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration »
Pengyi Li · Hongyao Tang · Tianpei Yang · Xiaotian Hao · Tong Sang · Yan Zheng · Jianye Hao · Matthew Taylor · Wenyuan Tao · Zhen Wang -
2021 Poster: MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration »
Jin Zhang · Jianhao Wang · Hao Hu · Tong Chen · Yingfeng Chen · Changjie Fan · Chongjie Zhang -
2021 Spotlight: MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration »
Jin Zhang · Jianhao Wang · Hao Hu · Tong Chen · Yingfeng Chen · Changjie Fan · Chongjie Zhang -
2021 Poster: Principled Exploration via Optimistic Bootstrapping and Backward Induction »
Chenjia Bai · Lingxiao Wang · Lei Han · Jianye Hao · Animesh Garg · Peng Liu · Zhaoran Wang -
2021 Spotlight: Principled Exploration via Optimistic Bootstrapping and Backward Induction »
Chenjia Bai · Lingxiao Wang · Lei Han · Jianye Hao · Animesh Garg · Peng Liu · Zhaoran Wang -
2020 Poster: Q-value Path Decomposition for Deep Multiagent Reinforcement Learning »
Yaodong Yang · Jianye Hao · Guangyong Chen · Hongyao Tang · Yingfeng Chen · Yujing Hu · Changjie Fan · Zhongyu Wei -
2020 Poster: Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising »
Xiaotian Hao · Zhaoqing Peng · Yi Ma · Guan Wang · Junqi Jin · Jianye Hao · Shan Chen · Rongquan Bai · Mingzhou Xie · Miao Xu · Zhenzhe Zheng · Chuan Yu · HAN LI · Jian Xu · Kun Gai