We consider the fully decentralized multi-agent reinforcement learning (MARL) problem, where the agents are connected via a time-varying and possibly sparse communication network. Specifically, we assume that the reward functions of the agents might correspond to different tasks, and are only known to the corresponding agent. Moreover, each agent makes individual decisions based on both the information observed locally and the messages received from its neighbors over the network. To maximize the globally averaged return over the network, we propose two fully decentralized actor-critic algorithms, which are applicable to large-scale MARL problems in an online fashion. Convergence guarantees are provided when the value functions are approximated within the class of linear functions. Our work appears to be the first theoretical study of fully decentralized MARL algorithms for networked agents that use function approximation.
Author Information
Kaiqing Zhang (University of Illinois at Urbana-Champaign (UIUC))
Zhuoran Yang (Princeton University)
Han Liu (Northwestern)
Tong Zhang (Tencent AI Lab)

Tong Zhang is a professor of Computer Science and Mathematics at the Hong Kong University of Science and Technology. His research interests are machine learning, big data, and their applications. He obtained a BA in Mathematics and Computer Science from Cornell University and a PhD in Computer Science from Stanford University. Before joining HKUST, Tong Zhang was a professor at Rutgers University, and previously worked at IBM and Yahoo as a research scientist, at Baidu as the director of the Big Data Lab, and at Tencent as the founding director of the AI Lab. Tong Zhang was an ASA fellow and an IMS fellow. He has served as chair or area chair at major machine learning conferences such as NIPS, ICML, and COLT, and as an associate editor for top machine learning journals such as PAMI, JMLR, and Machine Learning Journal.
Tamer Basar (University of Illinois at Urbana-Champaign)
Related Events (a corresponding poster, oral, or spotlight)
-
2018 Oral: Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents »
Fri. Jul 13th 03:40 -- 03:50 PM Room A1
More from the Same Authors
-
2021 : Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity »
Kaiqing Zhang · Xiangyuan Zhang · Bin Hu · Tamer Basar -
2021 : Randomized Least Squares Policy Optimization »
Haque Ishfaq · Zhuoran Yang · Andrei Lupu · Viet Nguyen · Lewis Liu · Riashat Islam · Zhaoran Wang · Doina Precup -
2021 : Decentralized Q-Learning in Zero-sum Markov Games »
Kaiqing Zhang · David Leslie · Tamer Basar · Asuman Ozdaglar -
2021 : Is Pessimism Provably Efficient for Offline RL? »
Ying Jin · Zhuoran Yang · Zhaoran Wang -
2021 : Efficient Exploration by HyperDQN in Deep Reinforcement Learning »
Ziniu Li · Yingru Li · Hao Liang · Tong Zhang -
2023 Poster: Feature Programming for Time Series Prediction »
Alex Reneau · Jerry Yao-Chieh Hu · Ammar Gilani · Han Liu -
2023 Poster: Beyond Uniform Lipschitz Condition in Differentially Private Optimization »
Rudrajit Das · Satyen Kale · Zheng Xu · Tong Zhang · Sujay Sanghavi -
2023 Poster: What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL? »
Rui Yang · Yong LIN · Xiaoteng Ma · Hao Hu · Chongjie Zhang · Tong Zhang -
2023 Poster: On the Convergence of Federated Averaging with Cyclic Client Participation »
Yae Jee Cho · PRANAY SHARMA · Gauri Joshi · Zheng Xu · Satyen Kale · Tong Zhang -
2023 Poster: Generalized Polyak Step Size for First Order Optimization with Momentum »
Xiaoyu Wang · Mikael Johansson · Tong Zhang -
2023 Poster: Learning in POMDPs is Sample-Efficient with Hindsight Observability »
Jonathan Lee · Alekh Agarwal · Christoph Dann · Tong Zhang -
2023 Poster: Corruption-Robust Algorithms with Uncertainty Weighting for Nonlinear Contextual Bandits and MDPs »
Chenlu Ye · Wei Xiong · Quanquan Gu · Tong Zhang -
2023 Poster: Weakly Supervised Disentangled Generative Causal Representation Learning »
Xinwei Shen · Furui Liu · Hanze Dong · Qing Lian · Zhitang Chen · Tong Zhang -
2022 Poster: On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning »
Weichao Mao · Lin Yang · Kaiqing Zhang · Tamer Basar -
2022 Poster: Bregman Proximal Langevin Monte Carlo via Bregman--Moreau Envelopes »
Tim Tsz-Kit Lau · Han Liu -
2022 Poster: A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games »
Wei Xiong · Han Zhong · Chengshuai Shi · Cong Shen · Tong Zhang -
2022 Poster: Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets »
Han Zhong · Wei Xiong · Jiyuan Tan · Liwei Wang · Tong Zhang · Zhaoran Wang · Zhuoran Yang -
2022 Spotlight: Bregman Proximal Langevin Monte Carlo via Bregman--Moreau Envelopes »
Tim Tsz-Kit Lau · Han Liu -
2022 Spotlight: On Improving Model-Free Algorithms for Decentralized Multi-Agent Reinforcement Learning »
Weichao Mao · Lin Yang · Kaiqing Zhang · Tamer Basar -
2022 Spotlight: Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium Learning from Offline Datasets »
Han Zhong · Wei Xiong · Jiyuan Tan · Liwei Wang · Tong Zhang · Zhaoran Wang · Zhuoran Yang -
2022 Spotlight: A Self-Play Posterior Sampling Algorithm for Zero-Sum Markov Games »
Wei Xiong · Han Zhong · Chengshuai Shi · Cong Shen · Tong Zhang -
2022 Poster: Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint »
Hao Liu · Minshuo Chen · Siawpeng Er · Wenjing Liao · Tong Zhang · Tuo Zhao -
2022 Spotlight: Benefits of Overparameterized Convolutional Residual Networks: Function Approximation under Smoothness Constraint »
Hao Liu · Minshuo Chen · Siawpeng Er · Wenjing Liao · Tong Zhang · Tuo Zhao -
2022 Poster: A Theoretical Analysis on Independence-driven Importance Weighting for Covariate-shift Generalization »
Renzhe Xu · Xingxuan Zhang · Zheyan Shen · Tong Zhang · Peng Cui -
2022 Poster: Sparse Invariant Risk Minimization »
Xiao Zhou · Yong LIN · Weizhong Zhang · Tong Zhang -
2022 Poster: Model Agnostic Sample Reweighting for Out-of-Distribution Learning »
Xiao Zhou · Yong LIN · Renjie Pi · Weizhong Zhang · Renzhe Xu · Peng Cui · Tong Zhang -
2022 Poster: Probabilistic Bilevel Coreset Selection »
Xiao Zhou · Renjie Pi · Weizhong Zhang · Yong LIN · Zonghao Chen · Tong Zhang -
2022 Spotlight: A Theoretical Analysis on Independence-driven Importance Weighting for Covariate-shift Generalization »
Renzhe Xu · Xingxuan Zhang · Zheyan Shen · Tong Zhang · Peng Cui -
2022 Spotlight: Probabilistic Bilevel Coreset Selection »
Xiao Zhou · Renjie Pi · Weizhong Zhang · Yong LIN · Zonghao Chen · Tong Zhang -
2022 Spotlight: Model Agnostic Sample Reweighting for Out-of-Distribution Learning »
Xiao Zhou · Yong LIN · Renjie Pi · Weizhong Zhang · Renzhe Xu · Peng Cui · Tong Zhang -
2022 Spotlight: Sparse Invariant Risk Minimization »
Xiao Zhou · Yong LIN · Weizhong Zhang · Tong Zhang -
2021 Poster: Near-Optimal Model-Free Reinforcement Learning in Non-Stationary Episodic MDPs »
Weichao Mao · Kaiqing Zhang · Ruihao Zhu · David Simchi-Levi · Tamer Basar -
2021 Poster: Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality »
Tengyu Xu · Zhuoran Yang · Zhaoran Wang · Yingbin LIANG -
2021 Poster: Randomized Exploration in Reinforcement Learning with General Value Function Approximation »
Haque Ishfaq · Qiwen Cui · Viet Nguyen · Alex Ayoub · Zhuoran Yang · Zhaoran Wang · Doina Precup · Lin Yang -
2021 Spotlight: Near-Optimal Model-Free Reinforcement Learning in Non-Stationary Episodic MDPs »
Weichao Mao · Kaiqing Zhang · Ruihao Zhu · David Simchi-Levi · Tamer Basar -
2021 Spotlight: Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality »
Tengyu Xu · Zhuoran Yang · Zhaoran Wang · Yingbin LIANG -
2021 Spotlight: Randomized Exploration in Reinforcement Learning with General Value Function Approximation »
Haque Ishfaq · Qiwen Cui · Viet Nguyen · Alex Ayoub · Zhuoran Yang · Zhaoran Wang · Doina Precup · Lin Yang -
2021 Poster: Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions »
Shuang Qiu · Xiaohan Wei · Jieping Ye · Zhaoran Wang · Zhuoran Yang -
2021 Poster: On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game »
Shuang Qiu · Jieping Ye · Zhaoran Wang · Zhuoran Yang -
2021 Poster: Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time »
Weichen Wang · Jiequn Han · Zhuoran Yang · Zhaoran Wang -
2021 Spotlight: Global Convergence of Policy Gradient for Linear-Quadratic Mean-Field Control/Game in Continuous Time »
Weichen Wang · Jiequn Han · Zhuoran Yang · Zhaoran Wang -
2021 Oral: On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game »
Shuang Qiu · Jieping Ye · Zhaoran Wang · Zhuoran Yang -
2021 Oral: Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions »
Shuang Qiu · Xiaohan Wei · Jieping Ye · Zhaoran Wang · Zhuoran Yang -
2021 Poster: Learning While Playing in Mean-Field Games: Convergence and Optimality »
Qiaomin Xie · Zhuoran Yang · Zhaoran Wang · Andreea Minca -
2021 Poster: Is Pessimism Provably Efficient for Offline RL? »
Ying Jin · Zhuoran Yang · Zhaoran Wang -
2021 Spotlight: Is Pessimism Provably Efficient for Offline RL? »
Ying Jin · Zhuoran Yang · Zhaoran Wang -
2021 Spotlight: Learning While Playing in Mean-Field Games: Convergence and Optimality »
Qiaomin Xie · Zhuoran Yang · Zhaoran Wang · Andreea Minca -
2021 Town Hall: Town Hall »
John Langford · Marina Meila · Tong Zhang · Le Song · Stefanie Jegelka · Csaba Szepesvari -
2021 Poster: Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach »
Yingjie Fei · Zhuoran Yang · Zhaoran Wang -
2021 Poster: Reinforcement Learning for Cost-Aware Markov Decision Processes »
Wesley A Suttle · Kaiqing Zhang · Zhuoran Yang · Ji Liu · David N Kraemer -
2021 Spotlight: Reinforcement Learning for Cost-Aware Markov Decision Processes »
Wesley A Suttle · Kaiqing Zhang · Zhuoran Yang · Ji Liu · David N Kraemer -
2021 Oral: Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach »
Yingjie Fei · Zhuoran Yang · Zhaoran Wang -
2020 Poster: Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning »
Lingxiao Wang · Zhuoran Yang · Zhaoran Wang -
2020 Poster: Robust One-Bit Recovery via ReLU Generative Networks: Near-Optimal Statistical Rate and Global Landscape Analysis »
Shuang Qiu · Xiaohan Wei · Zhuoran Yang -
2020 Poster: Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Convergence Rate »
Yufeng Zhang · Qi Cai · Zhuoran Yang · Zhaoran Wang -
2020 Poster: Guided Learning of Nonconvex Models through Successive Functional Gradient Optimization »
Rie Johnson · Tong Zhang -
2020 Poster: Provably Efficient Exploration in Policy Optimization »
Qi Cai · Zhuoran Yang · Chi Jin · Zhaoran Wang -
2020 Poster: On the Global Optimality of Model-Agnostic Meta-Learning »
Lingxiao Wang · Qi Cai · Zhuoran Yang · Zhaoran Wang -
2020 Poster: Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees »
Sen Na · Yuwei Luo · Zhuoran Yang · Zhaoran Wang · Mladen Kolar -
2019 Poster: $\texttt{DoubleSqueeze}$: Parallel Stochastic Gradient Descent with Double-pass Error-Compensated Compression »
Hanlin Tang · Chen Yu · Xiangru Lian · Tong Zhang · Ji Liu -
2019 Oral: $\texttt{DoubleSqueeze}$: Parallel Stochastic Gradient Descent with Double-pass Error-Compensated Compression »
Hanlin Tang · Chen Yu · Xiangru Lian · Tong Zhang · Ji Liu -
2019 Poster: On the statistical rate of nonlinear recovery in generative models with heavy-tailed data »
Xiaohan Wei · Zhuoran Yang · Zhaoran Wang -
2019 Oral: On the statistical rate of nonlinear recovery in generative models with heavy-tailed data »
Xiaohan Wei · Zhuoran Yang · Zhaoran Wang -
2019 Poster: Grid-Wise Control for Multi-Agent Reinforcement Learning in Video Game AI »
Lei Han · Peng Sun · Yali Du · Jiechao Xiong · Qing Wang · Xinghai Sun · Han Liu · Tong Zhang -
2019 Oral: Grid-Wise Control for Multi-Agent Reinforcement Learning in Video Game AI »
Lei Han · Peng Sun · Yali Du · Jiechao Xiong · Qing Wang · Xinghai Sun · Han Liu · Tong Zhang -
2019 Tutorial: Causal Inference and Stable Learning »
Tong Zhang · Peng Cui -
2018 Poster: An Algorithmic Framework of Variable Metric Over-Relaxed Hybrid Proximal Extra-Gradient Method »
Li Shen · Peng Sun · Yitong Wang · Wei Liu · Tong Zhang -
2018 Poster: Candidates vs. Noises Estimation for Large Multi-Class Classification Problem »
Lei Han · Yiheng Huang · Tong Zhang -
2018 Oral: An Algorithmic Framework of Variable Metric Over-Relaxed Hybrid Proximal Extra-Gradient Method »
Li Shen · Peng Sun · Yitong Wang · Wei Liu · Tong Zhang -
2018 Oral: Candidates vs. Noises Estimation for Large Multi-Class Classification Problem »
Lei Han · Yiheng Huang · Tong Zhang -
2018 Poster: Graphical Nonconvex Optimization via an Adaptive Convex Relaxation »
Qiang Sun · Kean Ming Tan · Han Liu · Tong Zhang -
2018 Poster: Composite Functional Gradient Learning of Generative Adversarial Models »
Rie Johnson · Tong Zhang -
2018 Poster: Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization »
Jiaxiang Wu · Weidong Huang · Junzhou Huang · Tong Zhang -
2018 Oral: Graphical Nonconvex Optimization via an Adaptive Convex Relaxation »
Qiang Sun · Kean Ming Tan · Han Liu · Tong Zhang -
2018 Oral: Composite Functional Gradient Learning of Generative Adversarial Models »
Rie Johnson · Tong Zhang -
2018 Oral: Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization »
Jiaxiang Wu · Weidong Huang · Junzhou Huang · Tong Zhang -
2018 Poster: Safe Element Screening for Submodular Function Minimization »
Weizhong Zhang · Bin Hong · Lin Ma · Wei Liu · Tong Zhang -
2018 Poster: End-to-end Active Object Tracking via Reinforcement Learning »
Wenhan Luo · Peng Sun · Fangwei Zhong · Wei Liu · Tong Zhang · Yizhou Wang -
2018 Poster: Feedback-Based Tree Search for Reinforcement Learning »
Daniel Jiang · Emmanuel Ekwedike · Han Liu -
2018 Oral: Feedback-Based Tree Search for Reinforcement Learning »
Daniel Jiang · Emmanuel Ekwedike · Han Liu -
2018 Oral: End-to-end Active Object Tracking via Reinforcement Learning »
Wenhan Luo · Peng Sun · Fangwei Zhong · Wei Liu · Tong Zhang · Yizhou Wang -
2018 Oral: Safe Element Screening for Submodular Function Minimization »
Weizhong Zhang · Bin Hong · Lin Ma · Wei Liu · Tong Zhang -
2017 Poster: Projection-free Distributed Online Learning in Networks »
Wenpeng Zhang · Peilin Zhao · Wenwu Zhu · Steven Hoi · Tong Zhang -
2017 Poster: High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation »
Zhuoran Yang · Krishnakumar Balasubramanian · Han Liu -
2017 Talk: Projection-free Distributed Online Learning in Networks »
Wenpeng Zhang · Peilin Zhao · Wenwu Zhu · Steven Hoi · Tong Zhang -
2017 Talk: High-dimensional Non-Gaussian Single Index Models via Thresholded Score Function Estimation »
Zhuoran Yang · Krishnakumar Balasubramanian · Han Liu -
2017 Poster: Efficient Distributed Learning with Sparsity »
Jialei Wang · Mladen Kolar · Nati Srebro · Tong Zhang -
2017 Talk: Efficient Distributed Learning with Sparsity »
Jialei Wang · Mladen Kolar · Nati Srebro · Tong Zhang