Episodic memory-based methods can rapidly latch onto past successful strategies via a non-parametric memory, improving the sample efficiency of traditional reinforcement learning. However, little attention has been paid to the continuous domain, where a state is never visited twice and previous episodic methods fail to aggregate experience efficiently across trajectories. To address this problem, we propose Generalizable Episodic Memory (GEM), which effectively organizes the state-action values of episodic memory in a generalizable manner and supports implicit planning on memorized trajectories. GEM utilizes a double estimator to reduce the overestimation bias induced by value propagation in the planning process. Empirical evaluation shows that our method significantly outperforms existing trajectory-based methods on various MuJoCo continuous control tasks. To further demonstrate its general applicability, we evaluate our method on Atari games with discrete action spaces, where it also shows a significant improvement over baseline algorithms.
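The double-estimator idea mentioned in the abstract can be illustrated with a toy sketch: two independent value estimates cross-bootstrap while returns are propagated backward along a memorized trajectory, and the smaller of the two resulting targets is kept, which damps the overestimation bias that a single maximizing estimator accumulates. This is only an illustrative approximation of the general technique, not GEM's actual algorithm; the rewards and the table-based stand-in estimators `q1` and `q2` are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy trajectory and two stand-in value estimators (in practice these would
# be two separately trained value networks; here, random tables per step).
T = 5
rewards = np.array([1.0, 0.5, 0.0, 2.0, 1.0])
q1 = rng.normal(1.0, 0.5, size=T)  # bootstrap values from estimator 1
q2 = rng.normal(1.0, 0.5, size=T)  # bootstrap values from estimator 2
gamma = 0.99

def backward_returns(rewards, q_boot, gamma):
    """Propagate values backward along a stored trajectory, taking at each
    step the better of continuing along the trajectory or bootstrapping:
        target[t] = r_t + gamma * max(target[t+1], q_boot[t+1])
    """
    T = len(rewards)
    target = np.empty(T)
    target[-1] = rewards[-1]
    for t in range(T - 2, -1, -1):
        target[t] = rewards[t] + gamma * max(target[t + 1], q_boot[t + 1])
    return target

# Double estimation: each pass bootstraps with the *other* estimator's
# values, then the element-wise minimum of the two targets is used.
tgt1 = backward_returns(rewards, q2, gamma)
tgt2 = backward_returns(rewards, q1, gamma)
targets = np.minimum(tgt1, tgt2)
print(targets)
```

Because each step maximizes over the trajectory return and a bootstrap value, a single noisy estimator would systematically inflate the targets; taking the minimum over two cross-bootstrapped passes counteracts that inflation, in the spirit of double Q-learning.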
Author Information
Hao Hu (Tsinghua University)
Jianing Ye (Peking University)
Guangxiang Zhu (Tsinghua University)
Zhizhou Ren (University of Illinois at Urbana-Champaign)
Chongjie Zhang (Tsinghua University)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Spotlight: Generalizable Episodic Memory for Deep Reinforcement Learning »
  Wed. Jul 21st 02:35 -- 02:40 AM
More from the Same Authors
- 2022 Poster: Off-Policy Reinforcement Learning with Delayed Rewards »
  Beining Han · Zhizhou Ren · Zuofan Wu · Yuan Zhou · Jian Peng
- 2022 Poster: On the Role of Discount Factor in Offline Reinforcement Learning »
  Hao Hu · Yiqin Yang · Qianchuan Zhao · Chongjie Zhang
- 2022 Spotlight: On the Role of Discount Factor in Offline Reinforcement Learning »
  Hao Hu · Yiqin Yang · Qianchuan Zhao · Chongjie Zhang
- 2022 Spotlight: Off-Policy Reinforcement Learning with Delayed Rewards »
  Beining Han · Zhizhou Ren · Zuofan Wu · Yuan Zhou · Jian Peng
- 2022 Poster: Proximal Exploration for Model-guided Protein Sequence Design »
  Zhizhou Ren · Jiahan Li · Fan Ding · Yuan Zhou · Jianzhu Ma · Jian Peng
- 2022 Spotlight: Proximal Exploration for Model-guided Protein Sequence Design »
  Zhizhou Ren · Jiahan Li · Fan Ding · Yuan Zhou · Jianzhu Ma · Jian Peng
- 2022 Poster: Self-Organized Polynomial-Time Coordination Graphs »
  Qianlan Yang · Weijun Dong · Zhizhou Ren · Jianhao Wang · Tonghan Wang · Chongjie Zhang
- 2022 Poster: Individual Reward Assisted Multi-Agent Reinforcement Learning »
  Li Wang · Yupeng Zhang · Yujing Hu · Weixun Wang · Chongjie Zhang · Yang Gao · Jianye Hao · Tangjie Lv · Changjie Fan
- 2022 Spotlight: Individual Reward Assisted Multi-Agent Reinforcement Learning »
  Li Wang · Yupeng Zhang · Yujing Hu · Weixun Wang · Chongjie Zhang · Yang Gao · Jianye Hao · Tangjie Lv · Changjie Fan
- 2022 Spotlight: Self-Organized Polynomial-Time Coordination Graphs »
  Qianlan Yang · Weijun Dong · Zhizhou Ren · Jianhao Wang · Tonghan Wang · Chongjie Zhang
- 2021 Poster: MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration »
  Jin Zhang · Jianhao Wang · Hao Hu · Tong Chen · Yingfeng Chen · Changjie Fan · Chongjie Zhang
- 2021 Spotlight: MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration »
  Jin Zhang · Jianhao Wang · Hao Hu · Tong Chen · Yingfeng Chen · Changjie Fan · Chongjie Zhang
- 2020 Poster: ROMA: Multi-Agent Reinforcement Learning with Emergent Roles »
  Tonghan Wang · Heng Dong · Victor Lesser · Chongjie Zhang