

Talk in Workshop: Stein’s Method for Machine Learning and Statistics

Invited Talk - Paul Valiant: How the Ornstein-Uhlenbeck process drives generalization for deep learning.

2019 Talk

Abstract:

After discussing the Ornstein-Uhlenbeck process and how it arises in the context of Stein's method, we turn to an analysis of the stochastic gradient descent method that drives deep learning. We show that certain noise that often arises in the training process induces an Ornstein-Uhlenbeck process on the learned parameters. This process is responsible for a weak regularization effect on training: once training reaches a stable point, it will provably have found the "simplest possible" hypothesis consistent with the training data. At a higher level, we argue that some of the big mysteries behind the success of deep learning may be revealed by analyzing the subtle stochastic processes that govern deep learning training, a natural focus for this community.
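
For readers unfamiliar with the process named in the abstract, the following is the standard one-dimensional textbook form of the Ornstein-Uhlenbeck process and its link to Stein's method. The symbols \theta, \sigma, W_t, f and Z are notation assumed here for illustration; they are not taken from the talk, and the talk's own setting (an induced process on learned network parameters) may differ.

```latex
% Ornstein-Uhlenbeck process with mean-reversion rate \theta > 0 and
% noise scale \sigma > 0, driven by a standard Brownian motion W_t:
\[
  dX_t = -\theta X_t \, dt + \sigma \, dW_t .
\]
% Its stationary distribution is Gaussian:
\[
  X_\infty \sim \mathcal{N}\!\left(0, \frac{\sigma^2}{2\theta}\right).
\]
% With \theta = 1 and \sigma = \sqrt{2}, the stationary law is N(0, 1)
% and the generator of the process is
\[
  (\mathcal{A} f)(x) = f''(x) - x f'(x).
\]
% Stationarity gives E[(\mathcal{A} f)(Z)] = 0 for Z ~ N(0, 1) and
% suitably smooth f, which is the classical Stein identity for the
% standard normal distribution.
\[
  \mathbb{E}\!\left[ f''(Z) - Z f'(Z) \right] = 0, \qquad Z \sim \mathcal{N}(0, 1).
\]
```

The drift term pulls the state back toward zero while the noise term keeps perturbing it, which is one way to picture how noise in training could act as a weak regularizer on parameters.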
