Non-Stationary Online Structured Prediction with Surrogate Losses
Shinsaku Sakaue ⋅ Han Bao ⋅ Yuzhou Cao
Abstract
Online structured prediction, including online classification as a special case, is the task of sequentially predicting labels from input features. In this setting, the *surrogate regret*—the cumulative excess of the actual target loss (e.g., 0–1 loss) over the surrogate loss (e.g., logistic loss) incurred by the best fixed estimator—has gained attention because it admits a finite bound independent of the time horizon $T$. However, such guarantees break down in *non-stationary* environments, where every fixed estimator may incur surrogate loss that grows linearly with $T$. To address this limitation, we obtain an upper bound of $F_T + O(1 + P_T)$ on the cumulative target loss, where $F_T$ is the cumulative surrogate loss of any comparator sequence and $P_T$ is its *path length*. This bound depends on $T$ only through $F_T$ and $P_T$, thus offering stronger guarantees under non-stationarity. Our core idea is to combine the dynamic regret analysis of online gradient descent (OGD) with the *exploit-the-surrogate-gap* technique. This viewpoint sheds light on the usefulness of a Polyak-style learning rate for OGD, which systematically yields target-loss bounds and performs well empirically. We then extend our approach to broader settings beyond prior work via the *convolutional Fenchel–Young loss*. Finally, a lower bound shows that the dependence on $F_T$ and $P_T$ is tight.
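The abstract's core algorithmic idea, OGD with a Polyak-style learning rate driven by the surrogate loss, can be illustrated with a minimal sketch. This is an assumption-laden toy for online binary classification with the logistic surrogate, not the paper's algorithm: the step size $\eta_t = \ell_t(w_t)/\|\nabla \ell_t(w_t)\|^2$ uses $0$ as the surrogate-loss lower bound, and all function names are illustrative.

```python
import numpy as np

def logistic_loss_and_grad(w, x, y):
    """Logistic surrogate loss and gradient w.r.t. w; labels y in {-1, +1}."""
    margin = y * (w @ x)
    loss = np.log1p(np.exp(-margin))
    grad = -y * x / (1.0 + np.exp(margin))
    return loss, grad

def ogd_polyak(stream, dim):
    """OGD with a Polyak-style step size: eta_t = loss_t / ||grad_t||^2.

    Tracks the target (0-1) loss of each prediction while updating on
    the surrogate loss, mirroring the surrogate-regret setting.
    """
    w = np.zeros(dim)
    mistakes = []
    for x, y in stream:
        # Target (0-1) loss of the current prediction sign(w @ x).
        mistakes.append(int(np.sign(w @ x) != y))
        loss, grad = logistic_loss_and_grad(w, x, y)
        gnorm2 = grad @ grad
        if gnorm2 > 0:
            # Polyak-style rate with surrogate-loss lower bound 0.
            w = w - (loss / gnorm2) * grad
    return w, mistakes
```

On a margin-separable stream, the cumulative 0-1 loss of this update stays well below the horizon $T$, consistent with the finite surrogate-regret guarantees the abstract describes for the stationary case.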