Oral
Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence
Dongsheng Ding · Chen-Yu Wei · Kaiqing Zhang · Mihailo Jovanovic
We examine global non-asymptotic convergence properties of policy gradient methods for multi-agent reinforcement learning (RL) problems in Markov potential games (MPGs). To learn a Nash equilibrium of an MPG in which the size of the state space and/or the number of players can be very large, we propose new independent policy gradient algorithms that are run by all players in tandem. When there is no uncertainty in the gradient evaluation, we show that our algorithm finds an $\epsilon$-Nash equilibrium with $O(1/\epsilon^2)$ iteration complexity, which does not explicitly depend on the state space size. When the exact gradient is not available, we establish an $O(1/\epsilon^5)$ sample complexity bound in a potentially infinitely large state space for a sample-based algorithm that utilizes function approximation. Moreover, we identify a class of independent policy gradient algorithms that converge in both zero-sum Markov games and Markov cooperative games, with players that are oblivious to the type of game being played. Finally, we provide computational experiments to corroborate the merits and the effectiveness of our theoretical developments.
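To make the idea of independent policy gradient concrete, here is a minimal toy sketch. It is not the authors' algorithm. It assumes a single-state, two-player game with a shared reward matrix `R`, which is a simple special case of a Markov potential game with the common reward as the potential. Each player runs projected gradient ascent on its own mixed strategy using only its own gradient. The function names, step size `lr`, and iteration count `steps` are illustrative assumptions.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    n = v.size
    u = np.sort(v)[::-1]                      # sort in decreasing order
    css = np.cumsum(u) - 1.0                  # shifted cumulative sums
    idx = np.arange(1, n + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)              # threshold to subtract
    return np.maximum(v - theta, 0.0)

def nash_gap(R, x, y):
    """Largest gain either player can get by deviating unilaterally."""
    value = x @ R @ y
    return max((R @ y).max() - value, (x @ R).max() - value)

def independent_pg(R, steps=500, lr=0.1):
    """Both players update simultaneously, each using only its own gradient."""
    x = np.full(R.shape[0], 1.0 / R.shape[0])  # row player's mixed strategy
    y = np.full(R.shape[1], 1.0 / R.shape[1])  # column player's mixed strategy
    for _ in range(steps):
        gx = R @ y        # gradient of x^T R y with respect to x
        gy = R.T @ x      # gradient of x^T R y with respect to y
        x = project_simplex(x + lr * gx)
        y = project_simplex(y + lr * gy)
    return x, y

# Hypothetical 2x2 identical-interest game (both players receive R[a, b]).
R = np.array([[1.0, 0.0], [0.0, 2.0]])
x, y = independent_pg(R)
print("strategies:", x, y)
print("Nash gap:", nash_gap(R, x, y))
```

The Nash gap computed here is the quantity that an $\epsilon$-Nash equilibrium bounds by $\epsilon$: no player can improve its value by more than that amount through a unilateral deviation. The sketch uses exact gradients and a single state, so it says nothing about the sample-based or function-approximation settings described in the abstract.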
Author Information
Dongsheng Ding (University of Southern California)
Chen-Yu Wei (University of Southern California)
Kaiqing Zhang (MIT)
Mihailo Jovanovic (University of Southern California)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence
  Tue. Jul 19th through Wed. Jul 20th, Room Hall E #813
More from the Same Authors
- 2021 : Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity
  Kaiqing Zhang · Xiangyuan Zhang · Bin Hu · Tamer Basar
- 2021 : Decentralized Q-Learning in Zero-sum Markov Games
  Kaiqing Zhang · David Leslie · Tamer Basar · Asuman Ozdaglar
- 2021 : Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses
  Haipeng Luo · Chen-Yu Wei · Chung-Wei Lee
- 2023 : Uncoupled and Convergent Learning in Two-Player Zero-Sum Markov Games
  Yang Cai · Haipeng Luo · Chen-Yu Wei · Weiqiang Zheng
- 2023 Poster: Best of Both Worlds Policy Optimization
  Christoph Dann · Chen-Yu Wei · Julian Zimmert
- 2023 Oral: Best of Both Worlds Policy Optimization
  Christoph Dann · Chen-Yu Wei · Julian Zimmert
- 2023 Poster: Refined Regret for Adversarial MDPs with Linear Function Approximation
  Yan Dai · Haipeng Luo · Chen-Yu Wei · Julian Zimmert
- 2022 : What is a Good Metric to Study Generalization of Minimax Learners?
  Asuman Ozdaglar · Sarath Pattathil · Jiawei Zhang · Kaiqing Zhang
- 2022 Poster: On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning
  Weichao Mao · Lin Yang · Kaiqing Zhang · Tamer Basar
- 2022 Poster: Do Differentiable Simulators Give Better Policy Gradients?
  Hyung Ju Suh · Max Simchowitz · Kaiqing Zhang · Russ Tedrake
- 2022 Spotlight: On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning
  Weichao Mao · Lin Yang · Kaiqing Zhang · Tamer Basar
- 2022 Oral: Do Differentiable Simulators Give Better Policy Gradients?
  Hyung Ju Suh · Max Simchowitz · Kaiqing Zhang · Russ Tedrake
- 2022 Poster: Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning
  Alberto Bietti · Chen-Yu Wei · Miroslav Dudik · John Langford · Steven Wu
- 2022 Spotlight: Personalization Improves Privacy-Accuracy Tradeoffs in Federated Learning
  Alberto Bietti · Chen-Yu Wei · Miroslav Dudik · John Langford · Steven Wu
- 2021 Poster: Near-Optimal Model-Free Reinforcement Learning in Non-Stationary Episodic MDPs
  Weichao Mao · Kaiqing Zhang · Ruihao Zhu · David Simchi-Levi · Tamer Basar
- 2021 Poster: Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously
  Chung-Wei Lee · Haipeng Luo · Chen-Yu Wei · Mengxiao Zhang · Xiaojin Zhang
- 2021 Spotlight: Near-Optimal Model-Free Reinforcement Learning in Non-Stationary Episodic MDPs
  Weichao Mao · Kaiqing Zhang · Ruihao Zhu · David Simchi-Levi · Tamer Basar
- 2021 Spotlight: Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously
  Chung-Wei Lee · Haipeng Luo · Chen-Yu Wei · Mengxiao Zhang · Xiaojin Zhang
- 2021 Poster: Reinforcement Learning for Cost-Aware Markov Decision Processes
  Wesley A Suttle · Kaiqing Zhang · Zhuoran Yang · Ji Liu · David N Kraemer
- 2021 Spotlight: Reinforcement Learning for Cost-Aware Markov Decision Processes
  Wesley A Suttle · Kaiqing Zhang · Zhuoran Yang · Ji Liu · David N Kraemer
- 2020 Poster: Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes
  Chen-Yu Wei · Mehdi Jafarnia · Haipeng Luo · Hiteshi Sharma · Rahul Jain
- 2019 Poster: Bandit Multiclass Linear Classification: Efficient Algorithms for the Separable Case
  Alina Beygelzimer · David Pal · Balazs Szorenyi · Devanathan Thiruvenkatachari · Chen-Yu Wei · Chicheng Zhang
- 2019 Oral: Bandit Multiclass Linear Classification: Efficient Algorithms for the Separable Case
  Alina Beygelzimer · David Pal · Balazs Szorenyi · Devanathan Thiruvenkatachari · Chen-Yu Wei · Chicheng Zhang
- 2019 Poster: Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously
  Julian Zimmert · Haipeng Luo · Chen-Yu Wei
- 2019 Oral: Beating Stochastic and Adversarial Semi-bandits Optimally and Simultaneously
  Julian Zimmert · Haipeng Luo · Chen-Yu Wei
- 2018 Poster: Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents
  Kaiqing Zhang · Zhuoran Yang · Han Liu · Tong Zhang · Tamer Basar
- 2018 Oral: Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents
  Kaiqing Zhang · Zhuoran Yang · Han Liu · Tong Zhang · Tamer Basar