Recent years have witnessed the success of multi-agent reinforcement learning, which has motivated new research directions in mean-field control (MFC) and mean-field games (MFG), since a multi-agent system is well approximated by a mean-field problem when the number of agents grows very large. In this paper, we study the policy gradient (PG) method for linear-quadratic mean-field control and games, where we assume that all agents have identical linear state transitions and quadratic cost functions. While most recent works on policy gradient for MFC and MFG are based on discrete-time models, we focus on a continuous-time model, for which some of our analytical techniques may be of independent interest. For both the MFC and the MFG, we provide a PG update and show that it converges to the optimal solution at a linear rate, which is verified by a synthetic simulation. For the MFG, we also provide sufficient conditions for the existence and uniqueness of the Nash equilibrium.
Author Information
Weichen Wang (The University of Hong Kong)
Jiequn Han (Princeton University)
Zhuoran Yang (Princeton University)
Zhaoran Wang (Northwestern University)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Spotlight: Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time
  Wed. Jul 21st 02:40 -- 02:45 PM Room
More from the Same Authors
- 2021 : Randomized Least Squares Policy Optimization
  Haque Ishfaq · Zhuoran Yang · Andrei Lupu · Viet Nguyen · Lewis Liu · Riashat Islam · Zhaoran Wang · Doina Precup
- 2021 : Is Pessimism Provably Efficient for Offline RL?
  Ying Jin · Zhuoran Yang · Zhaoran Wang
- 2021 Poster: Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
  Tengyu Xu · Zhuoran Yang · Zhaoran Wang · Yingbin LIANG
- 2021 Poster: Randomized Exploration in Reinforcement Learning with General Value Function Approximation
  Haque Ishfaq · Qiwen Cui · Viet Nguyen · Alex Ayoub · Zhuoran Yang · Zhaoran Wang · Doina Precup · Lin Yang
- 2021 Spotlight: Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality
  Tengyu Xu · Zhuoran Yang · Zhaoran Wang · Yingbin LIANG
- 2021 Spotlight: Randomized Exploration in Reinforcement Learning with General Value Function Approximation
  Haque Ishfaq · Qiwen Cui · Viet Nguyen · Alex Ayoub · Zhuoran Yang · Zhaoran Wang · Doina Precup · Lin Yang
- 2021 Poster: Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
  Shuang Qiu · Xiaohan Wei · Jieping Ye · Zhaoran Wang · Zhuoran Yang
- 2021 Poster: On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game
  Shuang Qiu · Jieping Ye · Zhaoran Wang · Zhuoran Yang
- 2021 Oral: On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game
  Shuang Qiu · Jieping Ye · Zhaoran Wang · Zhuoran Yang
- 2021 Oral: Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions
  Shuang Qiu · Xiaohan Wei · Jieping Ye · Zhaoran Wang · Zhuoran Yang
- 2021 Poster: Learning While Playing in Mean-Field Games: Convergence and Optimality
  Qiaomin Xie · Zhuoran Yang · Zhaoran Wang · Andreea Minca
- 2021 Poster: Is Pessimism Provably Efficient for Offline RL?
  Ying Jin · Zhuoran Yang · Zhaoran Wang
- 2021 Spotlight: Is Pessimism Provably Efficient for Offline RL?
  Ying Jin · Zhuoran Yang · Zhaoran Wang
- 2021 Spotlight: Learning While Playing in Mean-Field Games: Convergence and Optimality
  Qiaomin Xie · Zhuoran Yang · Zhaoran Wang · Andreea Minca
- 2021 Poster: Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach
  Yingjie Fei · Zhuoran Yang · Zhaoran Wang
- 2021 Poster: Reinforcement Learning for Cost-Aware Markov Decision Processes
  Wesley A Suttle · Kaiqing Zhang · Zhuoran Yang · Ji Liu · David N Kraemer
- 2021 Spotlight: Reinforcement Learning for Cost-Aware Markov Decision Processes
  Wesley A Suttle · Kaiqing Zhang · Zhuoran Yang · Ji Liu · David N Kraemer
- 2021 Oral: Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach
  Yingjie Fei · Zhuoran Yang · Zhaoran Wang
- 2020 Poster: Computational and Statistical Tradeoffs in Inferring Combinatorial Structures of Ising Model
  Ying Jin · Zhaoran Wang · Junwei Lu
- 2020 Poster: Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning
  Lingxiao Wang · Zhuoran Yang · Zhaoran Wang
- 2020 Poster: Deep Reinforcement Learning with Smooth Policy
  Qianli Shen · Yan Li · Haoming Jiang · Zhaoran Wang · Tuo Zhao
- 2020 Poster: Robust One-Bit Recovery via ReLU Generative Networks: Near-Optimal Statistical Rate and Global Landscape Analysis
  Shuang Qiu · Xiaohan Wei · Zhuoran Yang
- 2020 Poster: Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Convergence Rate
  Yufeng Zhang · Qi Cai · Zhuoran Yang · Zhaoran Wang
- 2020 Poster: Provably Efficient Exploration in Policy Optimization
  Qi Cai · Zhuoran Yang · Chi Jin · Zhaoran Wang
- 2020 Poster: On the Global Optimality of Model-Agnostic Meta-Learning
  Lingxiao Wang · Qi Cai · Zhuoran Yang · Zhaoran Wang
- 2020 Poster: Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees
  Sen Na · Yuwei Luo · Zhuoran Yang · Zhaoran Wang · Mladen Kolar
- 2019 Poster: On the statistical rate of nonlinear recovery in generative models with heavy-tailed data
  Xiaohan Wei · Zhuoran Yang · Zhaoran Wang
- 2019 Oral: On the statistical rate of nonlinear recovery in generative models with heavy-tailed data
  Xiaohan Wei · Zhuoran Yang · Zhaoran Wang
- 2018 Poster: Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents
  Kaiqing Zhang · Zhuoran Yang · Han Liu · Tong Zhang · Tamer Basar
- 2018 Oral: Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents
  Kaiqing Zhang · Zhuoran Yang · Han Liu · Tong Zhang · Tamer Basar
- 2017 Poster: High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation
  Zhuoran Yang · Krishnakumar Balasubramanian · Han Liu
- 2017 Talk: High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation
  Zhuoran Yang · Krishnakumar Balasubramanian · Han Liu