Spotlight
On the Role of Discount Factor in Offline Reinforcement Learning
Hao Hu · yiqin yang · Qianchuan Zhao · Chongjie Zhang
Offline reinforcement learning (RL) enables effective learning from previously collected data without exploration, which shows great promise in real-world applications where exploration is expensive or even infeasible. The discount factor, $\gamma$, plays a vital role in improving the sample efficiency and estimation accuracy of online RL, but its role in offline RL is not well explored. This paper examines two distinct effects of $\gamma$ in offline RL with theoretical analysis, namely the regularization effect and the pessimism effect. On the one hand, $\gamma$ acts as a regularizer that trades off optimality against sample efficiency on top of existing offline techniques. On the other hand, a lower guidance $\gamma$ can also be seen as a form of pessimism, in which the policy's performance is optimized under the worst possible models. We empirically verify these theoretical observations on tabular MDPs and standard D4RL tasks. The results show that the discount factor plays an essential role in the performance of offline RL algorithms, both in small-data regimes on top of existing offline methods and in large-data regimes without other conservative methods.
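As a rough illustration of how the discount factor used for planning changes the resulting policy, the sketch below runs value iteration on a toy two-state deterministic MDP with two values of $\gamma$. The MDP, the function name `greedy_policy`, and the tables `NEXT_STATE` and `REWARD` are hypothetical and chosen only for this example; they are not the paper's experimental setup, and the sketch does not reproduce its theoretical results.

```python
def greedy_policy(next_state, reward, gamma, iters=500):
    """Value iteration on a deterministic tabular MDP (illustrative only).

    next_state[s][a] is the successor of state s under action a,
    reward[s][a] is the immediate reward. Returns (policy, values).
    """
    n_states = len(reward)
    values = [0.0] * n_states
    for _ in range(iters):
        # Bellman optimality backup for every state.
        values = [
            max(r + gamma * values[s2]
                for r, s2 in zip(reward[s], next_state[s]))
            for s in range(n_states)
        ]
    policy = []
    for s in range(n_states):
        q = [r + gamma * values[s2]
             for r, s2 in zip(reward[s], next_state[s])]
        policy.append(q.index(max(q)))  # first maximizer breaks ties
    return policy, values


# Hypothetical toy MDP (an assumption for this sketch):
# state 0, action 0: reward 1.0, stay in state 0 (small immediate payoff)
# state 0, action 1: reward 0.0, move to state 1 (delayed payoff)
# state 1, any action: reward 2.5, return to state 0
NEXT_STATE = [[0, 1], [0, 0]]
REWARD = [[1.0, 0.0], [2.5, 2.5]]

low_policy, _ = greedy_policy(NEXT_STATE, REWARD, gamma=0.5)
high_policy, _ = greedy_policy(NEXT_STATE, REWARD, gamma=0.9)
print(low_policy[0], high_policy[0])  # prints "0 1"
```

In this toy problem the delayed-payoff action is preferred in state 0 only when $\gamma > 2/3$, so planning with $\gamma = 0.5$ yields the more myopic choice. How such a choice interacts with limited data is the subject of the paper's analysis and is not modeled here.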
Author Information
Hao Hu (Tsinghua University)
yiqin yang (Tsinghua University)
Qianchuan Zhao (Tsinghua University)
Chongjie Zhang (Tsinghua University)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: On the Role of Discount Factor in Offline Reinforcement Learning
  Thu. Jul 21st through Fri the 22nd Room Hall E #924
More from the Same Authors
- 2022 Poster: Self-Organized Polynomial-Time Coordination Graphs
  Qianlan Yang · Weijun Dong · Zhizhou Ren · Jianhao Wang · Tonghan Wang · Chongjie Zhang
- 2022 Poster: Individual Reward Assisted Multi-Agent Reinforcement Learning
  Li Wang · Yupeng Zhang · Yujing Hu · Weixun Wang · Chongjie Zhang · Yang Gao · Jianye Hao · Tangjie Lv · Changjie Fan
- 2022 Spotlight: Individual Reward Assisted Multi-Agent Reinforcement Learning
  Li Wang · Yupeng Zhang · Yujing Hu · Weixun Wang · Chongjie Zhang · Yang Gao · Jianye Hao · Tangjie Lv · Changjie Fan
- 2022 Spotlight: Self-Organized Polynomial-Time Coordination Graphs
  Qianlan Yang · Weijun Dong · Zhizhou Ren · Jianhao Wang · Tonghan Wang · Chongjie Zhang
- 2021 Poster: MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration
  Jin Zhang · Jianhao Wang · Hao Hu · Tong Chen · Yingfeng Chen · Changjie Fan · Chongjie Zhang
- 2021 Spotlight: MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration
  Jin Zhang · Jianhao Wang · Hao Hu · Tong Chen · Yingfeng Chen · Changjie Fan · Chongjie Zhang
- 2021 Poster: Generalizable Episodic Memory for Deep Reinforcement Learning
  Hao Hu · Jianing Ye · Guangxiang Zhu · Zhizhou Ren · Chongjie Zhang
- 2021 Spotlight: Generalizable Episodic Memory for Deep Reinforcement Learning
  Hao Hu · Jianing Ye · Guangxiang Zhu · Zhizhou Ren · Chongjie Zhang
- 2020 Poster: ROMA: Multi-Agent Reinforcement Learning with Emergent Roles
  Tonghan Wang · Heng Dong · Victor Lesser · Chongjie Zhang