Although it has been known since the 1970s that a \textit{globally} optimal strategy profile in a common-payoff game is a Nash equilibrium, global optimality is a strict requirement that limits the result's applicability. In this work, we show that any \textit{locally} optimal symmetric strategy profile is also a (global) Nash equilibrium. Furthermore, we show that this result is robust to perturbations to the common payoff and to the local optimum. Applied to machine learning, our result provides a global guarantee for any gradient method that finds a local optimum in symmetric strategy space. While this result indicates stability to \textit{unilateral} deviation, we nevertheless identify broad classes of games where mixed local optima are unstable under \textit{joint}, asymmetric deviations. We analyze the prevalence of instability by running learning algorithms in a suite of symmetric games, and we conclude by discussing the applicability of our results to multi-agent RL, cooperative inverse RL, and decentralized POMDPs.
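The two main claims of the abstract can be illustrated on toy two-player games. The sketch below is not taken from the paper: the payoff matrices `U_coord` and `U_anti`, the step size, the starting point, and all function names are assumptions chosen for illustration, and projected gradient ascent stands in for "any gradient method". It runs the ascent in symmetric strategy space, checks that the resulting local optimum admits no profitable unilateral deviation, and then shows a mixed local optimum that a joint, asymmetric deviation improves upon.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]                       # entries in descending order
    css = np.cumsum(u) - 1.0
    ks = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / ks > 0)[0][-1]  # last index kept in the support
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def symmetric_payoff(U, p):
    """Common payoff when both players use the same mixed strategy p."""
    return float(p @ U @ p)

def ascend(U, p0, step=0.1, iters=500):
    """Projected gradient ascent on p -> p^T U p (gradient 2 U p for symmetric U)."""
    p = np.asarray(p0, dtype=float)
    for _ in range(iters):
        p = project_to_simplex(p + step * 2.0 * (U @ p))
    return p

def unilateral_gain(U, p):
    """Best improvement one player can obtain by deviating alone from (p, p)."""
    return float(np.max(U @ p) - symmetric_payoff(U, p))

# Hypothetical coordination game: both players receive U[a1, a2].
U_coord = np.array([[1.0, 0.0], [0.0, 2.0]])
p_coord = ascend(U_coord, [0.9, 0.1])          # ends at a local, non-global optimum
print(p_coord, symmetric_payoff(U_coord, p_coord), unilateral_gain(U_coord, p_coord))

# Hypothetical anti-coordination game with a mixed symmetric local optimum.
U_anti = np.array([[0.0, 1.0], [1.0, 0.0]])
p_anti = ascend(U_anti, [0.9, 0.1])
eps = np.array([0.1, -0.1])                    # players move in opposite directions
joint_payoff = float((p_anti + eps) @ U_anti @ (p_anti - eps))
print(p_anti, symmetric_payoff(U_anti, p_anti), joint_payoff)
```

In the coordination game the ascent stops at a pure strategy whose payoff is below the global optimum, yet no single player gains by deviating. In the anti-coordination game the uniform mixture is likewise stable to unilateral deviation, but the payoff rises when the two players deviate jointly in opposite directions.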
Author Information
Scott Emmons (UC Berkeley)
Caspar Oesterheld (Carnegie Mellon University)
Andrew Critch (UC Berkeley)
Vincent Conitzer (Duke)
Stuart Russell (UC Berkeley)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Spotlight: For Learning in Symmetric Teams, Local Optima are Global Nash Equilibria
  Wed. Jul 20th 03:00 -- 03:05 PM, Room 310
More from the Same Authors
- 2020 Contributed Talk: Incentive-Aware PAC Learning
  Hanrui Zhang · Vincent Conitzer
- 2020 Contributed Talk: Classification with Strategically Withheld Data
  Irving Rein · Hanrui Zhang · Vincent Conitzer
- 2020 Contributed Talk: Mitigating Manipulation in Peer Review via Randomized Reviewer Assignments
  Steven Jecmen · Hanrui Zhang · Ryan Liu · Nihar Shah · Vincent Conitzer
- 2020 Contributed Talk: Classification with Few Tests through Self-Selection
  Hanrui Zhang · Yu Cheng · Vincent Conitzer
- 2023 Poster: Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark
  Alexander Pan · Jun Shern Chan · Andy Zou · Nathaniel Li · Steven Basart · Thomas Woodside · Hanlin Zhang · Scott Emmons · Dan Hendrycks
- 2023 Oral: Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark
  Alexander Pan · Jun Shern Chan · Andy Zou · Nathaniel Li · Steven Basart · Thomas Woodside · Hanlin Zhang · Scott Emmons · Dan Hendrycks
- 2022 Poster: Estimating and Penalizing Induced Preference Shifts in Recommender Systems
  Micah Carroll · Anca Dragan · Stuart Russell · Dylan Hadfield-Menell
- 2022 Spotlight: Estimating and Penalizing Induced Preference Shifts in Recommender Systems
  Micah Carroll · Anca Dragan · Stuart Russell · Dylan Hadfield-Menell
- 2020 Poster: Learning the Valuations of a $k$-demand Agent
  Hanrui Zhang · Vincent Conitzer
- 2020 Poster: Learning Opinions in Social Networks
  Vincent Conitzer · Debmalya Panigrahi · Hanrui Zhang
- 2019 Poster: Cognitive model priors for predicting human decisions
  Joshua C Peterson · David D Bourgin · Daniel Reichman · Thomas Griffiths · Stuart Russell
- 2019 Oral: Cognitive model priors for predicting human decisions
  Joshua C Peterson · David D Bourgin · Daniel Reichman · Thomas Griffiths · Stuart Russell
- 2019 Poster: When Samples Are Strategically Selected
  Hanrui Zhang · Yu Cheng · Vincent Conitzer
- 2019 Oral: When Samples Are Strategically Selected
  Hanrui Zhang · Yu Cheng · Vincent Conitzer
- 2018 Poster: An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning
  Dhruv Malik · Malayandi Palaniappan · Jaime Fisac · Dylan Hadfield-Menell · Stuart Russell · Anca Dragan
- 2018 Oral: An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning
  Dhruv Malik · Malayandi Palaniappan · Jaime Fisac · Dylan Hadfield-Menell · Stuart Russell · Anca Dragan