Cooperative Multi-Agent Bandits with Heavy Tails
Abhimanyu Dubey · Alex "Sandy" Pentland

Wed Jul 15 09:00 AM -- 09:45 AM & Wed Jul 15 08:00 PM -- 08:45 PM (PDT)

We study the heavy-tailed stochastic bandit problem in the cooperative multi-agent setting, where a group of agents interacts with a common bandit problem while communicating over a network with delays. Existing algorithms for the stochastic bandit in this setting use confidence intervals arising from an averaging-based communication protocol known as running consensus, which does not lend itself to robust estimation in heavy-tailed settings. We propose MP-UCB, a decentralized multi-agent algorithm for the cooperative stochastic bandit that combines robust estimation with a message-passing protocol. We prove optimal regret bounds for MP-UCB in several problem settings and demonstrate its superiority over existing methods. Furthermore, we establish the first lower bounds for the cooperative bandit problem and provide efficient algorithms for robust bandit estimation of location.
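To illustrate why plain averaging is fragile under heavy tails, the sketch below compares the empirical mean with a median-of-means estimate. This is only an assumption about what "robust estimation" might look like in practice: the abstract does not say which estimator MP-UCB uses, and the function name `median_of_means` and the sample values are hypothetical.

```python
import statistics


def median_of_means(samples, num_blocks):
    """Illustrative robust mean estimate (not necessarily the paper's estimator).

    Splits the samples into `num_blocks` contiguous, equal-sized blocks,
    averages each block, and returns the median of the block means.
    Trailing samples that do not fill a whole block are ignored.
    """
    if num_blocks < 1 or num_blocks > len(samples):
        raise ValueError("num_blocks must be between 1 and len(samples)")
    block_size = len(samples) // num_blocks
    block_means = [
        sum(samples[i * block_size:(i + 1) * block_size]) / block_size
        for i in range(num_blocks)
    ]
    return statistics.median(block_means)


# Hypothetical rewards for one arm: five typical draws and one extreme draw.
rewards = [1.0, 1.2, 0.8, 1.1, 0.9, 1000.0]

plain_mean = sum(rewards) / len(rewards)                 # dominated by the extreme draw
robust_mean = median_of_means(rewards, num_blocks=3)     # stays near the typical draws
```

A single extreme draw shifts the plain average far from the typical rewards, while the median of block means is affected only if a majority of blocks are contaminated.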

Author Information

Abhimanyu Dubey (Massachusetts Institute of Technology)

I am a PhD student in the Human Dynamics group at MIT, advised by Professor Alex Pentland. My research interests are in robust and cooperative machine learning, including problems in multi-agent decision-making and transfer learning. Prior to this, I received a master's degree in Computer Science and a bachelor's degree in Electrical Engineering from IIT Delhi, where I was advised by Professor Sumeet Agarwal. I have also spent time as a research intern at Facebook AI and was a post-baccalaureate fellow in the Department of Economics at Harvard, under Professor Ed Glaeser. My research has been supported by a Snap Research Scholarship (2019) and an Emerging Worlds Fellowship (2017).

Alex "Sandy" Pentland (MIT)

More from the Same Authors