Tutorial

Toward Theoretical Understanding of Deep Learning

Sanjeev Arora

Victoria

Abstract:

We survey recent progress toward developing a theory of deep learning. Recent works have begun to address issues such as: (a) the effect of architecture choices on the optimization landscape, training speed, and expressiveness; (b) quantifying the true "capacity" of a net, as a step toward understanding why nets with vastly more parameters than training examples nevertheless do not overfit; (c) the inherent power and limitations of deep generative models, especially (various flavors of) generative adversarial networks (GANs); (d) properties of simple RNN-style language models and some of their solutions (word embeddings and sentence embeddings).

While these are early results, they illustrate what kind of theory may ultimately arise for deep learning.

Tutorial website: http://unsupervised.cs.princeton.edu/deeplearningtutorial.html
