

Oral in Workshop: 2nd ICML Workshop on New Frontiers in Adversarial Machine Learning

Near Optimal Adversarial Attack on UCB Bandits

Keywords: [ Bandit Algorithms ] [ Adversarial Attacks ]


Abstract: I study a stochastic multi-arm bandit problem where rewards are subject to adversarial corruption. I propose a novel attack strategy that manipulates a learner employing the upper-confidence-bound (UCB) algorithm into pulling some non-optimal target arm $T - o(T)$ times with a cumulative cost that scales as $\widehat{O}(\sqrt{\log T})$, where $T$ is the number of rounds. I also prove the first lower bound on the cumulative attack cost. The lower bound matches the upper bound up to $O(\log\log T)$ factors, showing the proposed attack strategy to be near optimal.
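
The abstract describes the attack only at a high level, so the sketch below is a generic illustration rather than the strategy proposed in the paper: it simulates a UCB1 learner on Gaussian-reward arms while an attacker applies a standard reward-poisoning heuristic, depressing each non-target arm's post-attack empirical mean below the target arm's estimate by a margin. The function name `ucb_under_attack`, the Gaussian reward model, and the `margin` parameter are assumptions made for this example.

```python
import numpy as np

def ucb_under_attack(true_means, target_arm, T, margin=0.1, seed=0):
    """Simulate a UCB1 learner whose observed rewards are corrupted by an attacker.

    The attack rule here is a generic reward-poisoning heuristic (push every
    non-target arm's post-attack mean below the target arm's estimate by
    `margin`); it is NOT the attack strategy from the paper.
    """
    rng = np.random.default_rng(seed)
    K = len(true_means)
    counts = np.zeros(K)       # pulls per arm, as seen by the learner
    means = np.zeros(K)        # post-attack empirical means
    attack_cost = 0.0
    target_pulls = 0

    for t in range(1, T + 1):
        if t <= K:
            arm = t - 1        # pull each arm once to initialize
        else:
            ucb = means + np.sqrt(2.0 * np.log(t) / counts)
            arm = int(np.argmax(ucb))

        reward = rng.normal(true_means[arm], 1.0)   # unit-variance Gaussian reward

        corruption = 0.0
        if arm != target_arm and counts[target_arm] > 0:
            # Corrupt the reward so this arm's running mean stays below
            # the target arm's current estimate minus the margin.
            desired_sum = (means[target_arm] - margin) * (counts[arm] + 1)
            current_sum = means[arm] * counts[arm] + reward
            corruption = min(0.0, desired_sum - current_sum)

        observed = reward + corruption
        attack_cost += abs(corruption)

        counts[arm] += 1
        means[arm] += (observed - means[arm]) / counts[arm]
        if arm == target_arm:
            target_pulls += 1

    return target_pulls, attack_cost
```

Calling, e.g., `ucb_under_attack([0.9, 0.5, 0.2], target_arm=2, T=10_000)` typically yields a target-pull count close to $T$ while the cumulative corruption grows much more slowly than $T$; the paper's contribution is an attack whose cost scales as $\widehat{O}(\sqrt{\log T})$ together with a matching lower bound.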
