

Poster

Sample-Efficient Multiagent Reinforcement Learning with Reset Replay

Yaodong Yang · Guangyong Chen · Jianye Hao · Pheng Ann Heng


Abstract:

The popularity of multiagent reinforcement learning (MARL) is growing rapidly with the demand for real-world tasks that require swarm intelligence. However, a noticeable drawback of MARL is its low sample efficiency, which requires a huge number of interactions with the environment. Moreover, when applying MARL to realistic tasks with highly complex system dynamics, a parallel environment setting is usually enabled to accelerate sample collection. This common setting further heightens the need for sample efficiency, as the budget for environment interactions is limited. Surprisingly, few MARL works address this practical problem, which greatly hampers the application of MARL to the real world. In response to this gap, we propose Multiagent Reinforcement Learning with Reset Replay (MARR), which greatly improves the sample efficiency of MARL by, for the first time, enabling MARL training at a high replay ratio in the parallel environment setting. To achieve this, MARR first introduces a reset strategy that maintains network plasticity so that MARL can continue to learn under a high replay ratio. Second, MARR incorporates a data augmentation technique to further boost sample efficiency. MARR is general and can be plugged into mainstream off-policy MARL algorithms with only slight modification. Extensive experiments in SMAC and MPE demonstrate that MARR significantly improves the performance of various MARL approaches with substantially fewer environment interactions.
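
To make the three ingredients named in the abstract concrete, here is a minimal sketch, not the authors' MARR implementation, of a single-agent off-policy loop that (1) performs many gradient updates per environment step (high replay ratio), (2) periodically reinitializes the network while keeping the replay buffer to restore plasticity, and (3) applies a simple observation augmentation to sampled batches. All names, network sizes, the noise-based augmentation, and the random "environment" are assumptions for illustration only.

```python
# Hypothetical sketch of: high replay ratio + periodic reset + data augmentation.
# Not the paper's MARR algorithm; every detail below is a placeholder.
import random
import numpy as np
import torch
import torch.nn as nn


class QNet(nn.Module):
    """Tiny Q-network; the architecture is a placeholder."""
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, obs):
        return self.net(obs)


def augment(obs_batch: torch.Tensor) -> torch.Tensor:
    """Hypothetical augmentation: small Gaussian noise on observations."""
    return obs_batch + 0.01 * torch.randn_like(obs_batch)


def train(env_steps=10_000, replay_ratio=8, reset_every=2_000,
          obs_dim=16, n_actions=5, batch_size=64, gamma=0.99):
    q = QNet(obs_dim, n_actions)
    opt = torch.optim.Adam(q.parameters(), lr=1e-3)
    buffer = []  # naive replay buffer of (obs, act, rew, next_obs) tuples

    for step in range(env_steps):
        # (1) collect one simulated transition; a real run would step parallel envs
        obs = np.random.randn(obs_dim).astype(np.float32)
        act = random.randrange(n_actions)
        rew = random.random()
        next_obs = np.random.randn(obs_dim).astype(np.float32)
        buffer.append((obs, act, rew, next_obs))

        # (2) high replay ratio: many gradient updates per environment step
        if len(buffer) >= batch_size:
            for _ in range(replay_ratio):
                batch = random.sample(buffer, batch_size)
                o = torch.tensor(np.stack([b[0] for b in batch]))
                a = torch.tensor([b[1] for b in batch])
                r = torch.tensor([b[2] for b in batch], dtype=torch.float32)
                o2 = torch.tensor(np.stack([b[3] for b in batch]))

                o = augment(o)  # (3) augment the sampled batch
                q_sa = q(o).gather(1, a.unsqueeze(1)).squeeze(1)
                with torch.no_grad():
                    target = r + gamma * q(o2).max(dim=1).values
                loss = nn.functional.mse_loss(q_sa, target)
                opt.zero_grad()
                loss.backward()
                opt.step()

        # (4) periodic reset to restore plasticity lost under heavy replay;
        # the replay buffer is kept, so earlier experience is not discarded
        if step > 0 and step % reset_every == 0:
            q = QNet(obs_dim, n_actions)
            opt = torch.optim.Adam(q.parameters(), lr=1e-3)


if __name__ == "__main__":
    train(env_steps=500)  # short smoke run
```

The design choice illustrated here is that a higher replay ratio squeezes more learning out of each environment interaction but tends to cause loss of plasticity, which the periodic reset counteracts; the actual MARR paper applies these ideas to multiagent off-policy algorithms in parallel environments.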
