
Competing for Shareable Arms in Multi-Player Multi-Armed Bandits
Renzhe Xu · Haotian Wang · Xingxuan Zhang · Bo Li · Peng Cui

Thu Jul 27 04:30 PM -- 06:00 PM (PDT) @ Exhibit Hall 1 #719
Event URL: https://github.com/windxrz/SMAA

Competition for shareable and limited resources among strategic agents has long been studied. In practice, agents often have to learn the rewards of the resources and maximize them at the same time. To design an individualized competing policy, we model the competition between agents in a novel multi-player multi-armed bandit (MPMAB) setting where players are selfish and aim to maximize their own rewards. In addition, when several players pull the same arm, we assume that these players share the arm's reward equally in expectation. Under this setting, we first analyze the Nash equilibrium when arms' rewards are known. Subsequently, we propose a novel Selfish MPMAB with Averaging Allocation (SMAA) approach based on the equilibrium. We theoretically demonstrate that SMAA achieves a good regret guarantee for each player when all players follow the algorithm. Additionally, we establish that no single selfish player can significantly increase their rewards through deviation, nor can they detrimentally affect other players' rewards without incurring substantial losses for themselves. We finally validate the effectiveness of the method in extensive synthetic experiments.
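
The sharing rule and the known-reward equilibrium mentioned in the abstract can be illustrated with a small sketch. It is not the SMAA algorithm and is not taken from the paper or its repository. It assumes that "sharing in expectation" means each of the m players on an arm with mean reward mu receives mu / m in expectation, and it uses a simple greedy assignment (a standard construction for congestion games of this kind) to produce one pure Nash equilibrium. All function names are hypothetical.

```python
from typing import List


def greedy_allocation(mu: List[float], num_players: int) -> List[int]:
    """Assign players one at a time to the arm with the best per-player share.

    Assumption (not from the paper): a player on arm k with m players
    in total receives mu[k] / m in expectation.
    Returns the number of players placed on each arm.
    """
    counts = [0] * len(mu)
    for _ in range(num_players):
        # Share the newcomer would receive on each arm after joining it.
        best = max(range(len(mu)), key=lambda k: mu[k] / (counts[k] + 1))
        counts[best] += 1
    return counts


def is_pure_nash(mu: List[float], counts: List[int], tol: float = 1e-12) -> bool:
    """Check that no single player gains by moving to another arm."""
    for j, m_j in enumerate(counts):
        if m_j == 0:
            continue  # nobody on this arm, so nobody can deviate from it
        current = mu[j] / m_j
        for k in range(len(mu)):
            # Moving to arm k would make it counts[k] + 1 players wide.
            if k != j and mu[k] / (counts[k] + 1) > current + tol:
                return False
    return True


if __name__ == "__main__":
    mu = [0.9, 0.5, 0.2]  # hypothetical known mean rewards
    counts = greedy_allocation(mu, num_players=3)
    print(counts, is_pure_nash(mu, counts))
```

With these hypothetical means and three players, two players end up on the best arm (0.45 each) and one on the second arm (0.5), and no player can do better by switching. In the learning setting the abstract describes, the means are unknown and must be estimated while competing.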

Author Information

Renzhe Xu (Tsinghua University)
Haotian Wang
Xingxuan Zhang (Tsinghua University)
Bo Li (Tsinghua University)
Peng Cui (Tsinghua University)
Peng Cui

Peng Cui is an Associate Professor at Tsinghua University. He received his PhD degree from Tsinghua University in 2010. His research interests include causal inference and stable learning, network representation learning, and human behavioral modeling. He has published more than 100 papers in prestigious conferences and journals in data mining and multimedia. His recent research won the IEEE Multimedia Best Department Paper Award, SIGKDD 2016 Best Paper Finalist, ICDM 2015 Best Student Paper Award, SIGKDD 2014 Best Paper Finalist, IEEE ICME 2014 Best Paper Award, ACM MM12 Grand Challenge Multimodal Award, and MMM13 Best Paper Award. He is an Associate Editor of IEEE TKDE, IEEE TBD, ACM TIST, and ACM TOMM, among others. He has served as program co-chair and area chair of several major machine learning and artificial intelligence conferences, such as IJCAI, AAAI, ACM CIKM, and ACM Multimedia.

More from the Same Authors