Multi-Agent Training beyond Zero-Sum with Correlated Equilibrium Meta-Solvers

Luke Marris · Paul Muller · Marc Lanctot · Karl Tuyls · Thore Graepel


Keywords: [ Reinforcement Learning and Planning ] [ Multi-Agent RL ]

Poster: Spot C3 in Virtual World, Tue 20 Jul 9 a.m. PDT to 11 a.m. PDT
Spotlight presentation: Reinforcement Learning (Multi-agent), Tue 20 Jul 5 a.m. PDT to 6 a.m. PDT


Two-player, constant-sum games are well studied in the literature, but there has been limited progress outside of this setting. We propose Joint Policy-Space Response Oracles (JPSRO), an algorithm for training agents in n-player, general-sum extensive-form games, which provably converges to an equilibrium. We further suggest correlated equilibria (CE) as promising meta-solvers, and propose a novel solution concept, Maximum Gini Correlated Equilibrium (MGCE), a principled and computationally efficient family of solutions to the correlated equilibrium selection problem. We conduct several experiments using CE meta-solvers for JPSRO and demonstrate convergence on n-player, general-sum games.
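
To make the correlated equilibrium selection problem concrete, here is a minimal Python sketch. It is not the authors' code. It assumes a hypothetical two-player game of Chicken with illustrative payoffs, and it assumes that "Gini" refers to the Gini impurity of the joint distribution, 1 minus the sum of squared probabilities. The sketch only checks whether candidate joint distributions are correlated equilibria and scores them; it does not solve for the MGCE.

```python
from itertools import product

# Hypothetical example game (Chicken): action 0 = dare, action 1 = swerve.
# PAYOFFS[(a0, a1)] = (payoff to player 0, payoff to player 1).
PAYOFFS = {
    (0, 0): (0, 0),
    (0, 1): (7, 2),
    (1, 0): (2, 7),
    (1, 1): (6, 6),
}
NUM_ACTIONS = (2, 2)


def max_deviation_gain(sigma):
    """Largest expected gain any player obtains by ignoring a recommendation.

    sigma maps joint actions to probabilities; missing keys count as zero.
    """
    worst = float("-inf")
    for player, n_actions in enumerate(NUM_ACTIONS):
        for recommended, deviation in product(range(n_actions), repeat=2):
            if recommended == deviation:
                continue
            gain = 0.0
            for joint, prob in sigma.items():
                if joint[player] != recommended:
                    continue
                # Swap in the deviating action, keep the opponent's action fixed.
                deviated = list(joint)
                deviated[player] = deviation
                gain += prob * (
                    PAYOFFS[tuple(deviated)][player] - PAYOFFS[joint][player]
                )
            worst = max(worst, gain)
    return worst


def is_correlated_equilibrium(sigma, tol=1e-9):
    """True if no player gains from deviating from any recommended action."""
    return max_deviation_gain(sigma) <= tol


def gini_impurity(sigma):
    """Gini impurity of the joint distribution (assumed selection objective)."""
    return 1.0 - sum(p * p for p in sigma.values())


# Three candidate joint distributions over the four joint actions.
mixed_ce = {(0, 1): 1 / 3, (1, 0): 1 / 3, (1, 1): 1 / 3}
pure_ne = {(0, 1): 1.0}
uniform = {joint: 0.25 for joint in PAYOFFS}

for name, sigma in [("mixed_ce", mixed_ce), ("pure_ne", pure_ne), ("uniform", uniform)]:
    print(name, is_correlated_equilibrium(sigma), round(gini_impurity(sigma), 4))
```

In this toy game both `mixed_ce` and `pure_ne` satisfy the equilibrium constraints, while `uniform` does not. The two equilibria differ in Gini impurity, which illustrates why a selection criterion is needed when many correlated equilibria exist.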
