

Oral

Modeling Others using Oneself in Multi-Agent Reinforcement Learning

Roberta Raileanu · Emily Denton · Arthur Szlam · Facebook Rob Fergus

Abstract:

We consider the multi-agent reinforcement learning setting with imperfect information. The reward function depends on the hidden goals of both agents, so the agents must infer the other players’ goals from their observed behavior in order to maximize their returns. We propose a new approach for learning in these domains: Self Other-Modeling (SOM), in which an agent uses its own policy to predict the other agent’s actions and update its belief of their hidden goal in an online manner. We evaluate this approach on three different tasks and show that the agents are able to learn better policies using their estimate of the other players’ goals, in both cooperative and competitive settings.
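
The core idea, an agent reusing its own policy to explain another agent's behavior, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the one-dimensional world, the hand-written softmax `policy`, the discrete set of candidate goals, and the Bayesian reweighting in `update_belief`. The abstract does not state the update rule, so this sketch substitutes a simple likelihood-based belief update over a fixed goal set.

```python
import math

ACTIONS = (-1, 0, 1)  # hypothetical action set: step left, stay, step right


def policy(position, goal, temperature=0.5):
    """Toy stand-in for the agent's own policy (an assumption, not the paper's model).

    Returns a softmax distribution over ACTIONS that favours steps
    reducing the distance to `goal`.
    """
    scores = [-abs((position + a) - goal) / temperature for a in ACTIONS]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def update_belief(belief, other_position, other_action, goals):
    """One online update: reweight each candidate goal by how likely the
    agent's own policy would have chosen the observed action under that goal."""
    idx = ACTIONS.index(other_action)
    posterior = [
        b * policy(other_position, g)[idx] for b, g in zip(belief, goals)
    ]
    norm = sum(posterior)
    return [p / norm for p in posterior]


def infer_goal(trajectory, goals):
    """Start from a uniform belief and update it after every observed step."""
    belief = [1.0 / len(goals)] * len(goals)
    for other_position, other_action in trajectory:
        belief = update_belief(belief, other_position, other_action, goals)
    return belief


# Hypothetical observation: the other agent keeps stepping right from position 5.
goals = [0, 5, 10]
trajectory = [(5, 1), (6, 1), (7, 1)]
belief = infer_goal(trajectory, goals)
most_likely_goal = goals[belief.index(max(belief))]
```

In this toy run the belief concentrates on the rightmost goal, because that is the goal under which the agent's own policy would most plausibly have produced the observed steps. A learned policy network and a continuous goal representation would replace the hand-written pieces in a realistic setting.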
