Timezone: »

Data-Dependent Stability of Stochastic Gradient Descent
Ilja Kuzborskij · Christoph H. Lampert

Wed Jul 11 04:30 AM -- 04:50 AM (PDT) @ A6

We establish a data-dependent notion of algorithmic stability for Stochastic Gradient Descent (SGD), and employ it to develop novel generalization bounds. This is in contrast to previous distribution-free algorithmic stability results for SGD which depend on the worst-case constants. By virtue of the data-dependent argument, our bounds provide new insights into learning with SGD on convex and non-convex problems. In the convex case, we show that the bound on the generalization error depends on the risk at the initialization point. In the non-convex case, we prove that the expected curvature of the objective function around the initialization point has crucial influence on the generalization error. In both cases, our results suggest a simple data-driven strategy to stabilize SGD by pre-screening its initialization. As a corollary, our results allow us to show optimistic generalization bounds that exhibit fast convergence rates for SGD subject to a vanishing empirical risk and low noise of stochastic gradient.

Author Information

Ilja Kuzborskij (University of Milan)
Christoph H. Lampert (IST Austria)

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors