Skip to yearly menu bar Skip to main content

Workshop: Workshop on Reinforcement Learning Theory

Invited Speaker: Qiaomin Xie: Reinforcement Learning for Zero-Sum Markov Games Using Function Approximation and Correlated Equilibrium

Qiaomin Xie


We consider on-policy reinforcement learning for two-player zero-sum Markov games with simultaneous moves, which generalizes both matrix games and Markov decision processes. To ensure exploration, our algorithm needs to construct upper and lower confidence bounds for both players in each round. We show that this can be done by finding the Coarse Correlated Equilibrium of an appropriate general-sum matrix game, which can be solved in polynomial time. The algorithm applies to the offline setting where one aims to find the Nash Equilibria of the Markov game, and the online setting where one plays against an arbitrary opponent and aims to minimize regret. For both settings, we provide sample complexity and regret guarantees.

Our results generalize to Markov games with continuous and unbounded state spaces. Using linear function approximation, we develop an optimistic version of least-squares minimax value iteration, for which we provide performance guarantees that only depend on the linear dimension. This setting highlights the analytical challenges that arise due to the non-Lipschitzness of CCEs of general-sum games.

Based on .