In many real-world multi-agent systems, sparse team rewards often make it difficult for an algorithm to learn a cooperative team policy. At present, a common way to solve this problem is to design dense individual rewards that guide the agents toward cooperation. However, most existing works use individual rewards in ways that do not always promote teamwork and are sometimes even counterproductive. In this paper, we propose Individual Reward Assisted Team Policy Learning (IRAT), which learns two policies for each agent, one from the dense individual reward and one from the sparse team reward, with discrepancy constraints through which the two policies update each other. Experimental results in different scenarios, such as the Multi-Agent Particle Environment and the Google Research Football Environment, show that IRAT significantly outperforms the baseline methods and can greatly promote team policy learning without deviating from the original team objective, even when the individual rewards are misleading or conflict with the team rewards.
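The idea of two policies per agent that constrain each other can be illustrated with a small sketch. This is not the paper's actual objective: it assumes, purely for illustration, that the discrepancy constraint is a KL-divergence penalty on discrete action distributions. The names `kl_divergence`, `penalized_objective`, and `coef`, and all numeric values, are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete action distributions given as lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def penalized_objective(surrogate, own_probs, other_probs, coef):
    """Surrogate return minus a discrepancy penalty toward the other policy.

    Assumption: the penalty is coef * KL(own || other); the paper may use a
    different form of constraint.
    """
    return surrogate - coef * kl_divergence(own_probs, other_probs)


# Hypothetical action distributions of one agent in a single state.
individual_policy = [0.7, 0.2, 0.1]  # trained on the dense individual reward
team_policy = [0.4, 0.4, 0.2]        # trained on the sparse team reward

# Hypothetical surrogate values (for example, advantage-weighted estimates).
team_obj = penalized_objective(0.05, team_policy, individual_policy, coef=0.1)
indiv_obj = penalized_objective(0.80, individual_policy, team_policy, coef=0.1)

print(f"team objective: {team_obj:.4f}")
print(f"individual objective: {indiv_obj:.4f}")
```

In this sketch each policy is pulled toward the other by the penalty term, so the team policy can borrow guidance from the dense individual signal while still being scored on the team reward. How the constraint strengths are set and adapted is left out here.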
Author Information
Li Wang (Nanjing University)
Yupeng Zhang (Nanjing University)
Yujing Hu (NetEase Fuxi AI Lab)
Weixun Wang (Tianjin University)
Chongjie Zhang (Tsinghua University)
Yang Gao (Nanjing University)
Jianye Hao (Tianjin University)
Tangjie Lv (NetEase Fuxi AI Lab)
Changjie Fan (NetEase Fuxi AI Lab)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Individual Reward Assisted Multi-Agent Reinforcement Learning
  Tue. Jul 19th through Wed the 20th Room Hall E #809
More from the Same Authors
- 2021: Optimistic Exploration with Backward Bootstrapped Bonus for Deep Reinforcement Learning
  Chenjia Bai · Lingxiao Wang · Lei Han · Jianye Hao · Animesh Garg · Peng Liu · Zhaoran Wang
- 2023: Boosting Off-policy RL with Policy Representation and Policy-extended Value Function Approximator
  Min Zhang · Jianye Hao · Hongyao Tang · Yan Zheng
- 2023: A Policy-Decoupled Method for High-Quality Data Augmentation in Offline Reinforcement Learning
  Shixi Lian · Yi Ma · Jinyi Liu · Jianye Hao · Yan Zheng · Zhaopeng Meng
- 2023: Improving Offline-to-Online Reinforcement Learning with Q-Ensembles
  Kai Zhao · Yi Ma · Jinyi Liu · Jianye Hao · Yan Zheng · Zhaopeng Meng
- 2023: Prioritized Trajectory Replay: A Replay Memory for Data-driven Reinforcement Learning
  Jinyi Liu · Yi Ma · Jianye Hao · Yujing Hu · Yan Zheng · Tangjie Lv · Changjie Fan
- 2023 Poster: RACE: Improve Multi-Agent Reinforcement Learning with Representation Asymmetry and Collaborative Evolution
  Pengyi Li · Jianye Hao · Hongyao Tang · Yan Zheng · Xian Fu
- 2023 Poster: MetaDiffuser: Diffusion Model as Conditional Planner for Offline Meta-RL
  Fei Ni · Jianye Hao · Yao Mu · Yifu Yuan · Yan Zheng · Bin Wang · Zhixuan Liang
- 2023 Poster: ChiPFormer: Transferable Chip Placement via Offline Decision Transformer
  Yao LAI · Jinxin Liu · Zhentao Tang · Bin Wang · Jianye Hao · Ping Luo
- 2022 Poster: On the Role of Discount Factor in Offline Reinforcement Learning
  Hao Hu · yiqin yang · Qianchuan Zhao · Chongjie Zhang
- 2022 Spotlight: On the Role of Discount Factor in Offline Reinforcement Learning
  Hao Hu · yiqin yang · Qianchuan Zhao · Chongjie Zhang
- 2022 Poster: Self-Organized Polynomial-Time Coordination Graphs
  Qianlan Yang · Weijun Dong · Zhizhou Ren · Jianhao Wang · Tonghan Wang · Chongjie Zhang
- 2022 Poster: Deconfounded Value Decomposition for Multi-Agent Reinforcement Learning
  Jiahui Li · Kun Kuang · Baoxiang Wang · Furui Liu · Long Chen · Changjie Fan · Fei Wu · Jun Xiao
- 2022 Poster: PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration
  Pengyi Li · Hongyao Tang · Tianpei Yang · Xiaotian Hao · Tong Sang · Yan Zheng · Jianye Hao · Matthew Taylor · Wenyuan Tao · Zhen Wang
- 2022 Spotlight: Deconfounded Value Decomposition for Multi-Agent Reinforcement Learning
  Jiahui Li · Kun Kuang · Baoxiang Wang · Furui Liu · Long Chen · Changjie Fan · Fei Wu · Jun Xiao
- 2022 Spotlight: Self-Organized Polynomial-Time Coordination Graphs
  Qianlan Yang · Weijun Dong · Zhizhou Ren · Jianhao Wang · Tonghan Wang · Chongjie Zhang
- 2022 Spotlight: PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration
  Pengyi Li · Hongyao Tang · Tianpei Yang · Xiaotian Hao · Tong Sang · Yan Zheng · Jianye Hao · Matthew Taylor · Wenyuan Tao · Zhen Wang
- 2021 Poster: MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration
  Jin Zhang · Jianhao Wang · Hao Hu · Tong Chen · Yingfeng Chen · Changjie Fan · Chongjie Zhang
- 2021 Spotlight: MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration
  Jin Zhang · Jianhao Wang · Hao Hu · Tong Chen · Yingfeng Chen · Changjie Fan · Chongjie Zhang
- 2021 Poster: Principled Exploration via Optimistic Bootstrapping and Backward Induction
  Chenjia Bai · Lingxiao Wang · Lei Han · Jianye Hao · Animesh Garg · Peng Liu · Zhaoran Wang
- 2021 Spotlight: Principled Exploration via Optimistic Bootstrapping and Backward Induction
  Chenjia Bai · Lingxiao Wang · Lei Han · Jianye Hao · Animesh Garg · Peng Liu · Zhaoran Wang
- 2021 Poster: Generalizable Episodic Memory for Deep Reinforcement Learning
  Hao Hu · Jianing Ye · Guangxiang Zhu · Zhizhou Ren · Chongjie Zhang
- 2021 Spotlight: Generalizable Episodic Memory for Deep Reinforcement Learning
  Hao Hu · Jianing Ye · Guangxiang Zhu · Zhizhou Ren · Chongjie Zhang
- 2020 Poster: ROMA: Multi-Agent Reinforcement Learning with Emergent Roles
  Tonghan Wang · Heng Dong · Victor Lesser · Chongjie Zhang
- 2020 Poster: Q-value Path Decomposition for Deep Multiagent Reinforcement Learning
  Yaodong Yang · Jianye Hao · Guangyong Chen · Hongyao Tang · Yingfeng Chen · Yujing Hu · Changjie Fan · Zhongyu Wei
- 2020 Poster: Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising
  Xiaotian Hao · Zhaoqing Peng · Yi Ma · Guan Wang · Junqi Jin · Jianye Hao · Shan Chen · Rongquan Bai · Mingzhou Xie · Miao Xu · Zhenzhe Zheng · Chuan Yu · HAN LI · Jian Xu · Kun Gai