Poster
Efficient Online Bandit Multiclass Learning with O(sqrt{T}) Regret
Alina Beygelzimer · Francesco Orabona · Chicheng Zhang
Abstract:
We present an efficient second-order algorithm with tilde{O}(1/eta sqrt{T}) regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by eta, ranging from hinge loss (eta=0) to squared hinge loss (eta=1). This provides a solution to the open problem of (Abernethy, J. and Rakhlin, A. An efficient bandit algorithm for sqrt{T}-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it performs favorably against earlier algorithms.
Chat is not available.