Poster
Tue Aug 08 01:30 AM -- 05:00 AM (PDT) @ Gallery #9
Efficient Online Bandit Multiclass Learning with O(sqrt{T}) Regret
In
Posters Tue
[
PDF]
[
Summary/Notes]
We present an efficient second-order algorithm with tilde{O}(1/eta sqrt{T}) regret for the bandit online multiclass problem. The regret bound holds simultaneously with respect to a family of loss functions parameterized by eta, ranging from hinge loss (eta=0) to squared hinge loss (eta=1). This provides a solution to the open problem of (Abernethy, J. and Rakhlin, A. An efficient bandit algorithm for sqrt{T}-regret in online multiclass prediction? In COLT, 2009). We test our algorithm experimentally, showing that it performs favorably against earlier algorithms.