Bregman meets Lévy: Stochastic Mirror Descent with Heavy-Tailed Noise in Continuous and Discrete Time
Pierre-Louis Cauvin ⋅ Panayotis Mertikopoulos
Abstract
We study the robustness of stochastic mirror descent (SMD) under heavy-tailed noise, focusing on whether the method retains its convergence guarantees when run with infinite-variance stochastic gradient input. To address this question in a principled manner, we begin by introducing a continuous-time model of SMD as a stochastic differential equation (SDE) driven by a centered Lévy noise process with finite $p$-th order moments, $1 < p \leq 2$. This scheme---which we call the Lévy mirror flow (LMF)---arises naturally as the scaling limit of SMD in the presence of heavy-tailed noise. In particular, when $p < 2$---the heavy noise regime---the trajectories of LMF generically exhibit jump discontinuities of arbitrary magnitude which, if frequent enough, lead to infinite variance. Nonetheless, despite this highly singular behavior, we show that LMF attains $\epsilon$-optimality within $\mathcal{O}(\epsilon^{-p/(p-1)})$ time in the convex case, and within $\tilde{\mathcal{O}}(\epsilon^{-1/(p-1)})$ time for (relatively) strongly convex objectives. These guarantees provide a transparent characterization of the impact of frequent long jumps on the convergence of the process, and percolate to a series of matching discrete-time guarantees for several variants of SMD under heavy-tailed noise.
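To make the setting concrete, the following is a minimal simulation sketch of SMD under heavy-tailed noise, not the paper's method or experiments: it assumes an entropic mirror map (exponentiated gradient) on the probability simplex, a hypothetical quadratic objective, and synthetic Pareto-tailed noise with finite $p$-th moments but infinite variance. All names, parameters, and the $t^{-1/p}$ step-size schedule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup (not from the paper): minimize the strongly convex
# objective f(x) = 0.5 * ||x - x_star||^2 over the probability simplex.
d = 5
x_star = np.arange(1.0, d + 1.0)
x_star /= x_star.sum()

def grad(x):
    return x - x_star

def heavy_tailed_noise(alpha=1.6):
    # Symmetric Pareto-type noise: E|xi|^q is finite for q < alpha, but the
    # variance is infinite when alpha < 2 -- the "heavy noise" regime.
    u = rng.uniform(size=d)
    return rng.choice([-1.0, 1.0], size=d) * (u ** (-1.0 / alpha) - 1.0)

def smd_entropic(T=20_000, step0=0.5, p=1.5, noise_scale=0.05):
    # SMD with the entropic mirror map (exponentiated gradient), run in the
    # log domain so that occasional large noise jumps do not overflow exp().
    log_x = np.full(d, -np.log(d))
    avg = np.zeros(d)
    for t in range(1, T + 1):
        x = np.exp(log_x - log_x.max())
        x = np.maximum(x, 1e-12)        # floor to keep every coordinate alive
        x /= x.sum()
        eta = step0 / t ** (1.0 / p)    # ~ t^{-1/p} step-size schedule
        g = grad(x) + noise_scale * heavy_tailed_noise()
        log_x = np.log(x) - eta * g     # mirror (exponentiated-gradient) step
        avg += (x - avg) / t            # ergodic average of the iterates
    return avg

x_hat = smd_entropic()
print(np.linalg.norm(x_hat - x_star))
```

Despite individual jumps of arbitrary magnitude in the noise, the decreasing step size and the ergodic average keep the returned point close to the minimizer, mirroring the robustness phenomenon the paper quantifies.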