Andrew Saxe: Intriguing phenomena in training and generalization dynamics of deep networks

Sat Jun 15 02:30 PM -- 03:00 PM (PDT)

In this talk I will describe several phenomena related to learning dynamics in deep networks. Among these are (a) large transient training-error spikes during full-batch gradient descent, with implications for the training error surface; (b) surprisingly strong generalization performance of large networks with modest label noise, even with infinite training time; (c) a training-speed/test-accuracy trade-off in vanilla deep networks; (d) the inability of deep networks to learn known efficient representations of certain functions; and finally (e) a trade-off between training speed and multitasking ability.
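As a concrete illustration of phenomenon (a), one can log the loss at every step of full-batch gradient descent and flag steps where it rises above its running minimum. The sketch below is not from the talk: the two-layer linear network, the random linear teacher, the hyperparameters, and the 10% spike threshold are all illustrative assumptions, chosen only to show how such spikes would be detected.

```python
import numpy as np

# Illustrative setup (assumed, not from the talk): a small two-layer
# linear network trained on data from a random linear teacher.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 10))            # inputs
Y = X @ rng.normal(size=(10, 3))         # linear-teacher targets

W1 = rng.normal(scale=0.1, size=(10, 8)) # small-scale initialization
W2 = rng.normal(scale=0.1, size=(8, 3))
lr = 0.01
losses = []

for step in range(500):
    H = X @ W1                           # hidden layer (linear)
    P = H @ W2                           # predictions
    E = P - Y
    losses.append(np.mean(E ** 2))       # log loss at every step
    # Full-batch gradients of the mean squared error
    gP = 2 * E / E.size
    gW2 = H.T @ gP
    gW1 = X.T @ (gP @ W2.T)
    W1 -= lr * gW1
    W2 -= lr * gW2

# Simple spike detector: steps where the loss exceeds the running
# minimum by more than 10% (threshold is an arbitrary choice).
spikes = [t for t in range(1, len(losses))
          if losses[t] > 1.1 * min(losses[:t])]
print(f"final loss {losses[-1]:.4f}, spike steps detected: {len(spikes)}")
```

With this small step size the loss decreases smoothly and few or no spikes appear; the point is only the instrumentation, since the regimes in which spikes actually arise are the subject of the talk.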

Author Information

Andrew Saxe (University of Oxford)
