Asymmetric Heavy Tails and Implicit Bias in Gaussian Noise Injections
Alexander D Camuto · Xiaoyu Wang · Lingjiong Zhu · Christopher Holmes · Mert Gurbuzbalaban · Umut Simsekli

Wed Jul 21 05:20 PM -- 05:25 PM (PDT)

Gaussian noise injections (GNIs) are a family of simple and widely used regularisation methods for training neural networks, in which additive or multiplicative Gaussian noise is injected into the network activations at every iteration of the optimisation algorithm, typically stochastic gradient descent (SGD). In this paper, we focus on the so-called 'implicit effect' of GNIs, which is the effect of the injected noise on the dynamics of SGD. We show that this effect induces an asymmetric heavy-tailed noise on SGD gradient updates. To model these modified dynamics, we first develop a Langevin-like stochastic differential equation that is driven by a general family of asymmetric heavy-tailed noise. Using this model, we then formally prove that GNIs induce an 'implicit bias', which varies depending on the heaviness of the tails and the level of asymmetry. Our empirical results confirm that different types of neural networks trained with GNIs are well-modelled by the proposed dynamics and that the implicit effect of these injections induces a bias that degrades the performance of networks.
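
To make the injection step in the abstract concrete, here is a minimal NumPy sketch of a forward pass with additive or multiplicative Gaussian noise applied to the hidden activations. It is an illustration under stated assumptions, not the authors' implementation: the function name `forward_with_gni`, the ReLU network, the noise scale `sigma`, and the choice to inject after the nonlinearity are all hypothetical. Fresh noise is drawn on every call, which mirrors injecting noise at every iteration of SGD.

```python
import numpy as np


def forward_with_gni(x, weights, rng, sigma=0.1, mode="additive"):
    """Forward pass of a small ReLU MLP with Gaussian noise injected into
    the hidden activations. Names and architecture are illustrative only."""
    h = x
    for i, w in enumerate(weights):
        h = h @ w
        if i < len(weights) - 1:  # inject into hidden layers only
            h = np.maximum(h, 0.0)  # ReLU activation
            # Fresh noise on every call, eps ~ N(0, sigma^2)
            eps = rng.normal(0.0, sigma, size=h.shape)
            if mode == "additive":
                h = h + eps
            elif mode == "multiplicative":
                h = h * (1.0 + eps)
            else:
                raise ValueError(f"unknown mode: {mode!r}")
    return h


rng = np.random.default_rng(0)
# Assumed toy network: 4 inputs, 8 hidden units, 1 output
weights = [0.5 * rng.normal(size=(4, 8)), 0.5 * rng.normal(size=(8, 1))]
x = rng.normal(size=(16, 4))

clean = forward_with_gni(x, weights, rng, sigma=0.0)
noisy_add = forward_with_gni(x, weights, rng, sigma=0.1, mode="additive")
noisy_mul = forward_with_gni(x, weights, rng, sigma=0.1, mode="multiplicative")
```

In a training loop, the loss would be computed on the noisy output, so the gradient used by each SGD step depends on that step's noise draw. This dependence is the channel through which the injected noise alters the SGD dynamics.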

Author Information

Alexander D Camuto (University of Oxford)
Xiaoyu Wang (Florida State University)
Lingjiong Zhu (Florida State University)
Christopher Holmes (University of Oxford)
Mert Gurbuzbalaban (Rutgers University)
Umut Simsekli (Inria/ENS)
