Timezone: »
Poster
The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation
Zhe Feng · David Parkes · Haifeng Xu
Thu Jul 16 07:00 AM -- 07:45 AM & Thu Jul 16 06:00 PM -- 06:45 PM (PDT) @
Motivated by economic applications such as recommender systems, we study the behavior of stochastic bandits algorithms under \emph{strategic behavior} conducted by rational actors, i.e., the arms. Each arm is a \emph{self-interested} strategic player who can modify its own reward whenever pulled, subject to a cross-period budget constraint, in order to maximize its own expected number of times of being pulled. We analyze the robustness of three popular bandit algorithms: UCB, $\varepsilon$-Greedy, and Thompson Sampling. We prove that all three algorithms achieve a regret upper bound $\mathcal{O}(\max \{ B, K\ln T\})$ where $B$ is the total budget across arms, $K$ is the total number of arms and $T$ is the running time of the algorithms. This regret guarantee holds for \emph{arbitrary adaptive} manipulation strategy of arms. Our second set of main results shows that this regret bound is \emph{tight}--- in fact, for UCB, it is tight even when we restrict the arms' manipulation strategies to form a \emph{Nash equilibrium}. We do so by characterizing the Nash equilibrium of the game induced by arms' strategic manipulations and show a regret lower bound of $\Omega(\max \{ B, K\ln T\})$ at the equilibrium.
Author Information
Zhe Feng (Harvard University)
David Parkes (Harvard University)
Haifeng Xu (University of Virginia)
More from the Same Authors
-
2020 : Contributed Talk: From Predictions to Decisions: Using Lookahead Regularization »
Nir Rosenfeld · Sai Srivatsa Ravindranath · David Parkes -
2023 Poster: Oracles and Followers: Stackelberg Equilibria in Deep Multi-Agent Reinforcement Learning »
Matthias Gerstgrasser · David Parkes -
2022 Poster: When Are Linear Stochastic Bandits Attackable? »
Huazheng Wang · Haifeng Xu · Hongning Wang -
2022 Poster: Learning from a Learning User for Optimal Recommendations »
Fan Yao · Chuanhao Li · Denis Nekipelov · Hongning Wang · Haifeng Xu -
2022 Poster: Selling Data To a Machine Learner: Pricing via Costly Signaling »
Junjie Chen · Minming Li · Haifeng Xu -
2022 Spotlight: Learning from a Learning User for Optimal Recommendations »
Fan Yao · Chuanhao Li · Denis Nekipelov · Hongning Wang · Haifeng Xu -
2022 Spotlight: When Are Linear Stochastic Bandits Attackable? »
Huazheng Wang · Haifeng Xu · Hongning Wang -
2022 Spotlight: Selling Data To a Machine Learner: Pricing via Costly Signaling »
Junjie Chen · Minming Li · Haifeng Xu -
2021 Poster: Learning Representations by Humans, for Humans »
Sophie Hilgard · Nir Rosenfeld · Mahzarin Banaji · Jack Cao · David Parkes -
2021 Spotlight: Learning Representations by Humans, for Humans »
Sophie Hilgard · Nir Rosenfeld · Mahzarin Banaji · Jack Cao · David Parkes -
2021 Poster: Diffusion Source Identification on Networks with Statistical Confidence »
Quinlan Dawkins · Tianxi Li · Haifeng Xu -
2021 Spotlight: Diffusion Source Identification on Networks with Statistical Confidence »
Quinlan Dawkins · Tianxi Li · Haifeng Xu -
2021 Poster: PAC-Learning for Strategic Classification »
Ravi Sundaram · Anil Vullikanti · Haifeng Xu · Fan Yao -
2021 Poster: Reserve Price Optimization for First Price Auctions in Display Advertising »
Zhe Feng · Sébastien Lahaie · Jon Schneider · Jinchao Ye -
2021 Oral: PAC-Learning for Strategic Classification »
Ravi Sundaram · Anil Vullikanti · Haifeng Xu · Fan Yao -
2021 Oral: Reserve Price Optimization for First Price Auctions in Display Advertising »
Zhe Feng · Sébastien Lahaie · Jon Schneider · Jinchao Ye -
2019 Poster: Fairness without Harm: Decoupled Classifiers with Preference Guarantees »
Berk Ustun · Yang Liu · David Parkes -
2019 Oral: Fairness without Harm: Decoupled Classifiers with Preference Guarantees »
Berk Ustun · Yang Liu · David Parkes -
2019 Poster: Learning to Collaborate in Markov Decision Processes »
Goran Radanovic · Rati Devidze · David Parkes · Adish Singla -
2019 Poster: Optimal Auctions through Deep Learning »
Paul Duetting · Zhe Feng · Harikrishna Narasimhan · David Parkes · Sai Srivatsa Ravindranath -
2019 Oral: Learning to Collaborate in Markov Decision Processes »
Goran Radanovic · Rati Devidze · David Parkes · Adish Singla -
2019 Oral: Optimal Auctions through Deep Learning »
Paul Duetting · Zhe Feng · Harikrishna Narasimhan · David Parkes · Sai Srivatsa Ravindranath