Timezone: »

Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously
Chung-Wei Lee · Haipeng Luo · Chen-Yu Wei · Mengxiao Zhang · Xiaojin Zhang

Wed Jul 21 09:00 PM -- 11:00 PM (PDT) @

In this work, we develop linear bandit algorithms that automatically adapt to different environments. By plugging a novel loss estimator into the optimization problem that characterizes the instance-optimal strategy, our first algorithm not only achieves nearly instance-optimal regret in stochastic environments, but also works in corrupted environments with additional regret being the amount of corruption, while the state-of-the-art (Li et al., 2019) achieves neither instance-optimality nor the optimal dependence on the corruption amount. Moreover, by equipping this algorithm with an adversarial component and carefully-designed testings, our second algorithm additionally enjoys minimax-optimal regret in completely adversarial environments, which is the first of this kind to our knowledge. Finally, all our guarantees hold with high probability, while existing instance-optimal guarantees only hold in expectation.

Author Information

Chung-Wei Lee (University of Southern California)
Haipeng Luo (University of Southern California)
Chen-Yu Wei (University of Southern California)
Mengxiao Zhang (University of Southern California)
Xiaojin Zhang (The Chinese University of Hong Kong)

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors