Talk

Efficient Online Bandit Multiclass Learning with O(sqrt{T}) Regret

Alina Beygelzimer · Francesco Orabona · Chicheng Zhang

2017 Talk

[ PDF] [

Summary/Notes] [ Video]

Abstract

We present an efficient second-order algorithm with tilde{O}(1/eta sqrt{T}) regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by eta, ranging from hinge loss (eta=0) to squared hinge loss (eta=1). This provides a solution to the open problem of (Abernethy, J. and Rakhlin, A. An efficient bandit algorithm for sqrt{T}-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it performs favorably against earlier algorithms.

Chat is not available.