Learning While Playing in Mean-Field Games: Convergence and Optimality
Qiaomin Xie · Zhuoran Yang · Zhaoran Wang · Andreea Minca

Tue Jul 20 09:00 PM -- 11:00 PM (PDT)

We study reinforcement learning in mean-field games. To achieve the Nash equilibrium, which consists of a policy and a mean-field state, existing algorithms require obtaining the optimal policy for each fixed mean-field state. In practice, however, the policy and the mean-field state evolve simultaneously, as each agent is learning while playing. To bridge this gap, we propose a fictitious play algorithm, which alternately updates the policy (learning) and the mean-field state (playing) by one step of policy optimization and one step of gradient descent, respectively. Despite the nonstationarity induced by such an alternating scheme, we prove that the proposed algorithm converges to the Nash equilibrium with an explicit convergence rate. To the best of our knowledge, this is the first provably efficient algorithm that achieves learning while playing via alternating updates.
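The alternating scheme described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a stateless toy game with a crowd-averse reward, an entropy-regularized mirror-descent step for the policy, and a gradient step on the squared distance between the mean-field and the policy. All names (`reward`, `learn_while_playing`, `base`, `crowd_cost`, `lam`, `alpha`, `beta`) and the step sizes are hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(x - x.max())
    return z / z.sum()


def reward(L, base, crowd_cost):
    """Toy reward (assumption): an action pays less the more crowded it is."""
    return base - crowd_cost * L


def learn_while_playing(base, crowd_cost=1.0, lam=1.0,
                        alpha=0.1, beta=0.1, n_iters=2000):
    """Alternate one policy step and one mean-field step per iteration.

    lam is the entropy-regularization weight, alpha the policy step size,
    and beta the mean-field step size (all hypothetical parameters).
    """
    k = len(base)
    pi = np.full(k, 1.0 / k)   # policy: distribution over actions
    L = np.zeros(k)            # mean-field: population distribution
    L[0] = 1.0                 # start away from equilibrium
    for _ in range(n_iters):
        # Learning: one entropy-regularized mirror-descent step,
        # pi_new proportional to pi^(1 - alpha*lam) * exp(alpha * r).
        r = reward(L, base, crowd_cost)
        pi = softmax((1.0 - alpha * lam) * np.log(pi) + alpha * r)
        # Playing: one gradient step on 0.5 * ||L - pi||^2,
        # which moves the mean-field toward the current policy.
        L = (1.0 - beta) * L + beta * pi
    return pi, L


base = np.array([1.0, 0.5, 0.0])
pi, L = learn_while_playing(base)
print("policy:    ", np.round(pi, 4))
print("mean-field:", np.round(L, 4))
```

In this toy setting, a regularized equilibrium is a pair in which the mean-field equals the policy and the policy equals `softmax(reward(L, base, crowd_cost) / lam)`. Neither quantity is ever solved to optimality with the other held fixed; both move by a single step per iteration.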

Author Information

Qiaomin Xie (Cornell University)
Zhuoran Yang (Princeton University)
Zhaoran Wang (Northwestern University)
Andreea Minca (Cornell University)
