

Invited talk
in
Workshop: Identifying and Understanding Deep Learning Phenomena

Andrew Saxe: Intriguing phenomena in training and generalization dynamics of deep networks

Andrew Saxe

2019 Invited talk

Abstract:

In this talk I will describe several phenomena related to learning dynamics in deep networks. Among these are (a) large transient training-error spikes during full-batch gradient descent, with implications for the shape of the training error surface; (b) surprisingly strong generalization performance of large networks under modest label noise, even with infinite training time; (c) a training-speed/test-accuracy trade-off in vanilla deep networks; (d) the inability of deep networks to learn known efficient representations of certain functions; and finally (e) a trade-off between training speed and multitasking ability.
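The first phenomenon concerns full-batch gradient descent, where the gradient is computed over the entire training set at every step and the training error can be tracked exactly. As a point of reference, here is a minimal sketch of that setup on a small two-layer linear network; the dimensions, learning rate, and synthetic data are arbitrary illustrative choices, not the experiments from the talk.

```python
import numpy as np

# Full-batch gradient descent on a two-layer linear network, recording
# the exact training error at every step. All hyperparameters here are
# illustrative assumptions, not those used in the talk's experiments.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))           # inputs
Y = X @ rng.standard_normal((10, 5))         # targets from a linear teacher
W1 = 0.1 * rng.standard_normal((10, 20))     # first-layer weights
W2 = 0.1 * rng.standard_normal((20, 5))      # second-layer weights
lr = 0.01
errors = []
for step in range(500):
    H = X @ W1                               # hidden activations
    err = H @ W2 - Y                         # residual on the FULL batch
    errors.append(float(np.mean(err ** 2)))  # exact training error
    gW2 = H.T @ err / len(X)                 # full-batch gradients
    gW1 = X.T @ (err @ W2.T) / len(X)
    W2 -= lr * gW2
    W1 -= lr * gW1
```

Because every step uses the whole dataset, any spike in `errors` reflects the geometry of the loss surface itself rather than mini-batch sampling noise, which is why this regime is informative about the training error surface.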
