Modelling Behavioural Diversity for Learning in Open-Ended Games
Nicolas Perez-Nieves · Yaodong Yang · Oliver Slumbers · David Mguni · Ying Wen · Jun Wang

Wed Jul 21 07:00 AM -- 07:20 AM (PDT)

Promoting behavioural diversity is critical for solving games with non-transitive dynamics, where strategic cycles exist and there is no consistent winner (e.g., Rock-Paper-Scissors). Yet a rigorous treatment of how to define diversity and how to construct diversity-aware learning dynamics is lacking. In this work, we offer a geometric interpretation of behavioural diversity in games and introduce a novel diversity metric based on determinantal point processes (DPP). By incorporating the diversity metric into best-response dynamics, we develop diverse fictitious play and diverse policy-space response oracle for solving normal-form games and open-ended games. We prove the uniqueness of the diverse best response and the convergence of our algorithms on two-player games. Importantly, we show that maximising the DPP-based diversity metric is guaranteed to enlarge the gamescape, that is, the convex polytopes spanned by agents' mixtures of strategies. To validate our diversity-aware solvers, we test on tens of games that show strong non-transitivity. Results suggest that our methods achieve exploitability at least as low as, and in most games lower than, that of PSRO solvers by finding effective and diverse strategies.
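
To make the geometric reading of a DPP-based diversity score concrete, here is a minimal sketch in Python. It is not the metric defined in the paper, whose exact form the abstract does not state. It relies only on the standard DPP property that a subset's probability is proportional to the determinant of its kernel submatrix, and it assumes, for illustration, that the kernel is the Gram matrix of the strategies' payoff vectors. The names `RPS` and `dpp_diversity` are hypothetical.

```python
import numpy as np

# Rock-Paper-Scissors payoffs for the row player; row i is the payoff
# vector of pure strategy i against (Rock, Paper, Scissors).
RPS = np.array([
    [0.0, -1.0, 1.0],   # Rock
    [1.0, 0.0, -1.0],   # Paper
    [-1.0, 1.0, 0.0],   # Scissors
])


def dpp_diversity(payoff_rows: np.ndarray) -> float:
    """Determinant of the Gram kernel L = M M^T of a population.

    Assumption (illustrative, not taken from the paper): each strategy is
    represented by its payoff vector, so det(L) is the squared volume of
    the parallelotope spanned by those vectors.
    """
    kernel = payoff_rows @ payoff_rows.T
    return float(np.linalg.det(kernel))


diverse = dpp_diversity(RPS[[0, 1]])    # population {Rock, Paper}
redundant = dpp_diversity(RPS[[0, 0]])  # population {Rock, Rock}
print(diverse, redundant)
```

Under these assumptions, the population {Rock, Paper} scores approximately 3, while the redundant population {Rock, Rock} scores approximately 0, because duplicated payoff vectors span no volume.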

Author Information

Nicolas Perez-Nieves (Imperial College London)
Yaodong Yang (Huawei UK)
Oliver Slumbers (UCL)
David Mguni (Noah's Ark Laboratory, Huawei)
Ying Wen (Shanghai Jiao Tong University)
Jun Wang (UCL)

More from the Same Authors