In this paper, we propose and study opportunistic bandits, a new variant of bandits where the regret of pulling a suboptimal arm varies under different environmental conditions, such as network load or produce price. When the load/price is low, so is the cost/regret of pulling a suboptimal arm (e.g., trying a suboptimal network configuration). Therefore, intuitively, we could explore more when the load/price is low and exploit more when the load/price is high. Inspired by this intuition, we propose an Adaptive Upper-Confidence-Bound (AdaUCB) algorithm to adaptively balance the exploration-exploitation tradeoff for opportunistic bandits. We prove that AdaUCB achieves O(log T) regret with a smaller coefficient than the traditional UCB algorithm. Furthermore, AdaUCB achieves O(1) regret with respect to T if the exploration cost is zero when the load level is below a certain threshold. Finally, experimental results based on both synthetic data and real-world traces show that AdaUCB significantly outperforms other bandit algorithms, such as UCB and TS (Thompson Sampling), under large load/price fluctuations.
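To make the explore-on-low-load intuition concrete, here is a minimal simulation sketch. It is not the paper's exact AdaUCB rule (whose index and analysis are given in the paper); it simply illustrates the abstract's idea with an assumed threshold rule: when the current load is below a hypothetical `threshold`, pull the arm with the highest UCB index (cheap exploration); when load is high, exploit the empirically best arm. The load-weighted regret matches the abstract's premise that the cost of a suboptimal pull scales with the load level.

```python
import math
import random


def load_aware_ucb(means, loads, horizon, threshold=0.5, seed=0):
    """Illustrative sketch only (NOT the paper's exact AdaUCB index):
    explore via UCB when load < threshold, otherwise exploit the
    empirically best arm. Regret of a pull is weighted by the load."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k          # pulls per arm
    sums = [0.0] * k          # cumulative reward per arm
    best = max(means)
    regret = 0.0
    for t in range(horizon):
        load = loads[t]
        if t < k:
            arm = t  # pull each arm once to initialize its estimate
        elif load < threshold:
            # Low load: exploration is cheap, use standard UCB1 indices.
            arm = max(range(k),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2 * math.log(t + 1) / counts[i]))
        else:
            # High load: a suboptimal pull is costly, so exploit.
            arm = max(range(k), key=lambda i: sums[i] / counts[i])
        reward = 1.0 if rng.random() < means[arm] else 0.0  # Bernoulli arm
        counts[arm] += 1
        sums[arm] += reward
        # Opportunistic regret: gap weighted by the current load level.
        regret += load * (best - means[arm])
    return regret
```

Usage: feed a sequence of load levels in [0, 1] (one per round) along with the arms' mean rewards; the returned value is the cumulative load-weighted regret. The threshold behavior mirrors the abstract's O(1)-regret regime, where exploration is free whenever the load falls below the threshold.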
Huasen Wu (Twitter)
Huasen Wu is currently a software engineer on the Recommendation Team at Twitter Inc. He received his B.S. and Ph.D. degrees from the School of Electronic and Information Engineering, Beihang University, Beijing, in 2007 and 2014, respectively. He was a Postdoctoral Researcher working with Prof. Xin Liu in the Department of Computer Science, University of California, Davis. From December 2010 to January 2012, he was a visiting student at UC Davis; from May 2014 to August 2014, he was a visiting scholar at the University of Illinois at Urbana-Champaign (UIUC); and from October 2012 to January 2014, he worked as a research intern in the Wireless and Networking Group, Microsoft Research Asia (MSRA).
Xueying Guo (University of California Davis)
Xin Liu (University of California, Davis)
Related Events (a corresponding poster, oral, or spotlight)
2018 Oral: Adaptive Exploration-Exploitation Tradeoff for Opportunistic Bandits »
Fri Jul 13th 08:10 -- 08:20 AM Room A5