Improved Bounds on Minimax Regret under Logarithmic Loss via Self-Concordance

Blair Bilodeau · Dylan Foster · Daniel Roy

Keywords: [ Statistical Learning Theory ] [ Online Learning / Bandits ] [ Online Learning, Active Learning, and Bandits ]

Abstract: We consider the classical problem of sequential probability assignment under logarithmic loss while competing against an arbitrary, potentially nonparametric class of experts. We obtain improved bounds on the minimax regret via a new approach that exploits the self-concordance property of the logarithmic loss. We show that for any expert class with (sequential) metric entropy $\mathcal{O}(\gamma^{-p})$ at scale $\gamma$, the minimax regret is $\mathcal{O}(n^{\frac{p}{p+1}})$, and that this rate cannot be improved without additional assumptions on the expert class under consideration. As an application of our techniques, we resolve the minimax regret for nonparametric Lipschitz classes of experts.
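For readers new to the setting, the following is a minimal sketch of sequential probability assignment under logarithmic loss against a finite expert class with binary outcomes. It is not the paper's method, which targets nonparametric classes. It uses the standard uniform-prior Bayesian mixture forecaster, whose regret is at most $\log N$ for $N$ experts. The function name `mixture_forecaster_regret` and the synthetic data are assumptions made for illustration.

```python
import math
import random


def mixture_forecaster_regret(expert_probs, outcomes):
    """Regret of the uniform-prior Bayesian mixture under log loss.

    expert_probs[t][i] is the probability expert i assigns to outcome 1
    at round t (assumed to lie strictly between 0 and 1);
    outcomes[t] is the realized binary outcome (0 or 1).
    """
    n_experts = len(expert_probs[0])
    log_weights = [0.0] * n_experts  # uniform prior, kept unnormalized in log space
    learner_loss = 0.0
    expert_losses = [0.0] * n_experts

    for probs, y in zip(expert_probs, outcomes):
        # Probability each expert assigned to the outcome that occurred.
        liks = [p if y == 1 else 1.0 - p for p in probs]

        # The learner predicts with the posterior-weighted mixture.
        shift = max(log_weights)  # subtract the max for numerical stability
        weights = [math.exp(lw - shift) for lw in log_weights]
        mix = sum(w * l for w, l in zip(weights, liks)) / sum(weights)
        learner_loss += -math.log(mix)

        # Accumulate each expert's log loss and update its posterior weight.
        for i, lik in enumerate(liks):
            expert_losses[i] += -math.log(lik)
            log_weights[i] += math.log(lik)

    return learner_loss - min(expert_losses)


if __name__ == "__main__":
    random.seed(0)
    n_rounds, n_experts = 200, 8
    expert_probs = [
        [random.uniform(0.05, 0.95) for _ in range(n_experts)]
        for _ in range(n_rounds)
    ]
    outcomes = [random.randint(0, 1) for _ in range(n_rounds)]
    regret = mixture_forecaster_regret(expert_probs, outcomes)
    print(f"regret = {regret:.4f}, log N = {math.log(n_experts):.4f}")
```

The $\log N$ guarantee holds for every outcome sequence, not only random ones, because the mixture's cumulative probability is at least $1/N$ times that of the best expert. The abstract's result concerns the harder case where the class is infinite and its size is measured by sequential metric entropy instead of $N$.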
