MASER: Multi-Agent Reinforcement Learning with Subgoals Generated from Experience Replay Buffer

JEON JEEWON · WOOJUN KIM · Whiyoung Jung · Youngchul Sung

Hall E #813

Keywords: [ RL: Multi-agent ] [ Reinforcement Learning ]

[ Abstract ]
[ Poster [ Paper PDF
Wed 20 Jul 3:30 p.m. PDT — 5:30 p.m. PDT
Spotlight presentation: Reinforcement Learning
Wed 20 Jul 10:15 a.m. PDT — 11:45 a.m. PDT


In this paper, we consider cooperative multi-agent reinforcement learning (MARL) with sparse reward. To tackle this problem, we propose a novel method named MASER: MARL with subgoals generated from experience replay buffer. Under the widely-used assumption of centralized training with decentralized execution and consistent Q-value decomposition for MARL, MASER automatically generates proper subgoals for multiple agents from the experience replay buffer by considering both individual Q-value and total Q-value. Then, MASER designs individual intrinsic reward for each agent based on actionable representation relevant to Q-learning so that the agents reach their subgoals while maximizing the joint action value. Numerical results show that MASER significantly outperforms StarCraft II micromanagement benchmark compared to other state-of-the-art MARL algorithms.

Chat is not available.