Promoting behavioural diversity is critical for solving games with non-transitive dynamics, where strategic cycles exist and there is no consistent winner (e.g., Rock-Paper-Scissors). Yet rigorous treatments of how to define diversity and how to construct diversity-aware learning dynamics are lacking. In this work, we offer a geometric interpretation of behavioural diversity in games and introduce a novel diversity metric based on \emph{determinantal point processes} (DPP). By incorporating the diversity metric into best-response dynamics, we develop \emph{diverse fictitious play} and \emph{diverse policy-space response oracle} for solving normal-form games and open-ended games. We prove the uniqueness of the diverse best response and the convergence of our algorithms on two-player games. Importantly, we show that maximising the DPP-based diversity metric is guaranteed to enlarge the \emph{gamescape}, the convex polytope spanned by agents' mixtures of strategies. To validate our diversity-aware solvers, we test on tens of games that show strong non-transitivity. Results suggest that our methods achieve exploitability at least as low as, and in most games lower than, that of PSRO solvers by finding effective and diverse strategies.
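As a rough illustration of how a DPP can score the diversity of a population, the sketch below computes the expected sample size of an L-ensemble DPP, E|Y| = Tr(I - (L + I)^{-1}), a standard DPP identity. Building the kernel as L = M M^T from the rows of a payoff matrix M is an assumption made here for illustration, and the function name `expected_cardinality` is hypothetical; the abstract does not specify the exact form of the metric.

```python
import numpy as np


def expected_cardinality(payoffs: np.ndarray) -> float:
    """Expected sample size of an L-ensemble DPP with kernel L = M M^T.

    Each row of `payoffs` is treated as the feature vector of one strategy
    (an assumption for this sketch, not a detail stated in the abstract).
    """
    M = np.asarray(payoffs, dtype=float)
    L = M @ M.T                      # Gram matrix of the strategies' payoff rows
    n = L.shape[0]
    # L + I is positive definite, so the inverse always exists.
    return float(np.trace(np.eye(n) - np.linalg.inv(L + np.eye(n))))


# Rock-Paper-Scissors payoffs for the row player.
rps = np.array([[0, -1, 1],
                [1, 0, -1],
                [-1, 1, 0]])

full = expected_cardinality(rps)             # Rock, Paper and Scissors
distinct = expected_cardinality(rps[[0, 1]])  # Rock and Paper
duplicate = expected_cardinality(rps[[0, 0]])  # Rock listed twice
```

Under these assumptions, a population holding two different strategies scores higher than one holding the same strategy twice, which matches the intuition that redundant strategies add little diversity.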
Author Information
Nicolas Perez-Nieves (Imperial College London)
Yaodong Yang (Huawei UK)
Oliver Slumbers (UCL)
David Mguni (Noah's Ark Laboratory, Huawei)
Ying Wen (Shanghai Jiao Tong University)
Jun Wang (UCL)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Poster: Modelling Behavioural Diversity for Learning in Open-Ended Games
  Wed. Jul 21st 04:00 -- 06:00 PM Room
More from the Same Authors
- 2023 Poster: GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models
  Hanjing Wang · Man-Kit Sit · Congjie He · Ying Wen · Weinan Zhang · Jun Wang · Yaodong Yang · Luo Mai
- 2023 Poster: MANSA: Learning Fast and Slow in Multi-Agent Systems
  David Mguni · Haojun Chen · Taher Jafferjee · Jianhong Wang · Longfei Yue · Xidong Feng · Stephen Mcaleer · Feifei Tong · Jun Wang · Yaodong Yang
- 2023 Poster: Regret-Minimizing Double Oracle for Extensive-Form Games
  Xiaohang Tang · Le Cong Dinh · Stephen Mcaleer · Yaodong Yang
- 2023 Poster: A Game-Theoretic Framework for Managing Risk in Multi-Agent Systems
  Oliver Slumbers · David Mguni · Stefano Blumberg · Stephen Mcaleer · Yaodong Yang · Jun Wang
- 2023 Poster: Cooperative Open-ended Learning Framework for Zero-Shot Coordination
  Yang Li · Shao Zhang · Jichen Sun · Yali Du · Ying Wen · Xinbing Wang · Wei Pan
- 2022 Poster: Understanding Policy Gradient Algorithms: A Sensitivity-Based Approach
  Shuang Wu · Ling Shi · Jun Wang · Guangjian Tian
- 2022 Poster: Plan Your Target and Learn Your Skills: Transferable State-Only Imitation Learning via Decoupled Policy Optimization
  Minghuan Liu · Zhengbang Zhu · Yuzheng Zhuang · Weinan Zhang · Jianye Hao · Yong Yu · Jun Wang
- 2022 Spotlight: Understanding Policy Gradient Algorithms: A Sensitivity-Based Approach
  Shuang Wu · Ling Shi · Jun Wang · Guangjian Tian
- 2022 Spotlight: Plan Your Target and Learn Your Skills: Transferable State-Only Imitation Learning via Decoupled Policy Optimization
  Minghuan Liu · Zhengbang Zhu · Yuzheng Zhuang · Weinan Zhang · Jianye Hao · Yong Yu · Jun Wang
- 2022 Poster: Saute RL: Almost Surely Safe Reinforcement Learning Using State Augmentation
  Aivar Sootla · Alexander I Cowen-Rivers · Taher Jafferjee · Ziyan Wang · David Mguni · Jun Wang · Haitham Bou Ammar
- 2022 Spotlight: Saute RL: Almost Surely Safe Reinforcement Learning Using State Augmentation
  Aivar Sootla · Alexander I Cowen-Rivers · Taher Jafferjee · Ziyan Wang · David Mguni · Jun Wang · Haitham Bou Ammar
- 2022 Poster: Greedy when Sure and Conservative when Uncertain about the Opponents
  Haobo Fu · Ye Tian · Hongxiang Yu · Weiming Liu · Shuang Wu · Jiechao Xiong · Ying Wen · Kai Li · Junliang Xing · Qiang Fu · Wei Yang
- 2022 Spotlight: Greedy when Sure and Conservative when Uncertain about the Opponents
  Haobo Fu · Ye Tian · Hongxiang Yu · Weiming Liu · Shuang Wu · Jiechao Xiong · Ying Wen · Kai Li · Junliang Xing · Qiang Fu · Wei Yang
- 2021 Poster: Learning in Nonzero-Sum Stochastic Games with Potentials
  David Mguni · Yutong Wu · Yali Du · Yaodong Yang · Ziyi Wang · Minne Li · Ying Wen · Joel Jennings · Jun Wang
- 2021 Poster: Estimating $\alpha$-Rank from A Few Entries with Low Rank Matrix Completion
  Yali Du · Xue Yan · Xu Chen · Jun Wang · Haifeng Zhang
- 2021 Spotlight: Learning in Nonzero-Sum Stochastic Games with Potentials
  David Mguni · Yutong Wu · Yali Du · Yaodong Yang · Ziyi Wang · Minne Li · Ying Wen · Joel Jennings · Jun Wang
- 2021 Spotlight: Estimating $\alpha$-Rank from A Few Entries with Low Rank Matrix Completion
  Yali Du · Xue Yan · Xu Chen · Jun Wang · Haifeng Zhang
- 2020 Poster: Multi-Agent Determinantal Q-Learning
  Yaodong Yang · Ying Wen · Jun Wang · Liheng Chen · Kun Shao · David Mguni · Weinan Zhang
- 2019 Poster: BayesNAS: A Bayesian Approach for Neural Architecture Search
  Hongpeng Zhou · Minghao Yang · Jun Wang · Wei Pan
- 2019 Oral: BayesNAS: A Bayesian Approach for Neural Architecture Search
  Hongpeng Zhou · Minghao Yang · Jun Wang · Wei Pan
- 2018 Poster: Mean Field Multi-Agent Reinforcement Learning
  Yaodong Yang · Rui Luo · Minne Li · Ming Zhou · Weinan Zhang · Jun Wang
- 2018 Oral: Mean Field Multi-Agent Reinforcement Learning
  Yaodong Yang · Rui Luo · Minne Li · Ming Zhou · Weinan Zhang · Jun Wang