

Spotlight

Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis

Ziyi Chen · Yi Zhou · Rong-Rong Chen · Shaofeng Zou

Room 318 - 320

Abstract: Actor-critic (AC) algorithms have been widely used in decentralized multi-agent systems to learn the optimal joint control policy. However, existing decentralized AC algorithms either require sharing agents' sensitive information or lack communication efficiency. In this work, we develop decentralized AC and natural AC (NAC) algorithms that avoid sharing agents' local information and are sample- and communication-efficient. In both algorithms, agents share only noisy rewards and use mini-batch local policy gradient updates to ensure high sample and communication efficiency. In particular, for decentralized NAC, we develop a decentralized Markovian SGD algorithm with an adaptive mini-batch size to efficiently compute the natural policy gradient. Under Markovian sampling and linear function approximation, we prove that the proposed decentralized AC and NAC algorithms achieve the state-of-the-art sample complexities $\mathcal{O}(\epsilon^{-2}\ln\epsilon^{-1})$ and $\mathcal{O}(\epsilon^{-3}\ln\epsilon^{-1})$, respectively, and achieve an improved communication complexity of $\mathcal{O}(\epsilon^{-1}\ln\epsilon^{-1})$. Numerical experiments demonstrate that the proposed algorithms achieve lower sample and communication complexities than existing decentralized AC algorithms.
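
To illustrate the reward-sharing scheme the abstract describes, below is a minimal sketch (not the authors' implementation) of a decentralized AC-style training loop in which agents exchange only noisy rewards over a gossip graph and apply mini-batch local policy gradient updates. The toy environment, the ring gossip matrix `W`, all hyperparameters, and the use of the gossiped reward in place of a learned critic are illustrative assumptions, not details from the paper.

```python
# Hedged sketch: decentralized policy gradient with noisy-reward gossip.
# Everything below (environment, graph, hyperparameters) is a toy stand-in;
# the paper's algorithms additionally use a critic with linear function
# approximation, which this sketch omits for brevity.
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, N_STATES, N_ACTIONS = 4, 5, 3
BATCH, LR, NOISE_STD = 32, 0.1, 0.05

# Doubly stochastic gossip matrix over a ring communication graph (assumed).
W = np.zeros((N_AGENTS, N_AGENTS))
for i in range(N_AGENTS):
    W[i, i] = 0.5
    W[i, (i - 1) % N_AGENTS] = 0.25
    W[i, (i + 1) % N_AGENTS] = 0.25

# Each agent keeps its own softmax policy logits; these are never shared.
theta = [np.zeros((N_STATES, N_ACTIONS)) for _ in range(N_AGENTS)]

def policy(logits, s):
    """Softmax policy over actions in state s."""
    p = np.exp(logits[s] - logits[s].max())
    return p / p.sum()

def step_env(s, actions):
    """Toy dynamics: each agent is rewarded for picking action 0;
    the next state is drawn uniformly for simplicity."""
    r = np.array([float(a == 0) for a in actions])
    return rng.integers(N_STATES), r

s = rng.integers(N_STATES)
for _ in range(100):                      # outer communication rounds
    grads = [np.zeros_like(theta[0]) for _ in range(N_AGENTS)]
    for _ in range(BATCH):                # mini-batch of local samples
        actions = [rng.choice(N_ACTIONS, p=policy(theta[i], s))
                   for i in range(N_AGENTS)]
        s_next, r = step_env(s, actions)
        # Agents exchange only noisy rewards via one gossip averaging round.
        r_shared = W @ (r + NOISE_STD * rng.standard_normal(N_AGENTS))
        for i in range(N_AGENTS):
            # Score-function gradient of log pi(a|s) for a softmax policy,
            # weighted by the gossiped reward (standing in for a critic).
            g = np.zeros_like(theta[i])
            g[s] = -policy(theta[i], s)
            g[s, actions[i]] += 1.0
            grads[i] += r_shared[i] * g
        s = s_next
    for i in range(N_AGENTS):             # one mini-batch update per round
        theta[i] += LR * grads[i] / BATCH
```

Note that the sketch compresses the paper's method: the full decentralized AC and NAC algorithms also maintain a critic with linear function approximation under Markovian sampling, and decentralized NAC additionally uses a decentralized Markovian SGD subroutine with an adaptive mini-batch size to estimate the natural policy gradient.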
