Identifying and Understanding Deep Learning Phenomena
Hanie Sedghi · Samy Bengio · Kenji Hata · Aleksander Madry · Ari Morcos · Behnam Neyshabur · Maithra Raghu · Ali Rahimi · Ludwig Schmidt · Ying Xiao

Sat Jun 15 08:30 AM -- 06:00 PM (PDT) @ Hall B

Our understanding of modern neural networks lags behind their practical successes. As this understanding gap grows, it poses a serious challenge to the future pace of progress, because fewer pillars of knowledge will be available to designers of models and algorithms. This workshop aims to close this understanding gap in deep learning. It solicits contributions that view the behavior of deep nets as a natural phenomenon to investigate with methods inspired by the natural sciences, such as physics, astronomy, and biology. We solicit empirical work that isolates phenomena in deep nets, describes them quantitatively, and then replicates or falsifies them.

As a starting point for this effort, we focus on the interplay between data, network architecture, and training algorithms. We are looking for contributions that identify precise, reproducible phenomena, as well as systematic studies and evaluations of current beliefs such as “sharp local minima do not generalize well” or “SGD navigates out of local minima”. Through the workshop, we hope to catalogue quantifiable versions of such statements, as well as demonstrate whether or not they occur reproducibly.

Sat 8:45 a.m. - 9:00 a.m.
Hanie Sedghi
Sat 9:00 a.m. - 9:30 a.m.
Nati Srebro: Optimization’s Untold Gift to Learning: Implicit Regularization (Talk)
Nati Srebro
Sat 9:30 a.m. - 9:45 a.m.
Bad Global Minima Exist and SGD Can Reach Them (Spotlight)
Sat 9:45 a.m. - 10:00 a.m.
Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask (Spotlight)
Sat 10:00 a.m. - 10:30 a.m.
Chiyuan Zhang: Are all layers created equal? -- Studies on how neural networks represent functions (Talk)
Sat 10:30 a.m. - 11:00 a.m.
Break and Posters
Sat 11:00 a.m. - 11:15 a.m.
Line attractor dynamics in recurrent networks for sentiment classification (Spotlight)
Sat 11:15 a.m. - 11:30 a.m.
Do deep neural networks learn shallow learnable examples first? (Spotlight)
Sat 11:30 a.m. - 12:00 p.m.
Crowdsourcing Deep Learning Phenomena
Sat 12:00 p.m. - 1:30 p.m.
Lunch and Posters
Sat 1:30 p.m. - 2:00 p.m.
Aude Oliva: Reverse engineering neuroscience and cognitive science principles (Talk)
Sat 2:00 p.m. - 2:15 p.m.
On Understanding the Hardness of Samples in Neural Networks (Spotlight)
Sat 2:15 p.m. - 2:30 p.m.
On the Convex Behavior of Deep Neural Networks in Relation to the Layers' Width (Spotlight)
Sat 2:30 p.m. - 3:00 p.m.
In this talk I will describe several phenomena related to learning dynamics in deep networks. Among these are (a) large transient training-error spikes during full-batch gradient descent, with implications for the training error surface; (b) surprisingly strong generalization performance of large networks with modest label noise, even with infinite training time; (c) a training-speed/test-accuracy trade-off in vanilla deep networks; (d) the inability of deep networks to learn known efficient representations of certain functions; and finally (e) a trade-off between training speed and multitasking ability.

Andrew Saxe
Sat 3:00 p.m. - 4:00 p.m.
Break and Posters
Sat 4:00 p.m. - 4:30 p.m.
Olga Russakovsky
Sat 4:30 p.m. - 5:30 p.m.
Panel Discussion
Panelists: Kevin Murphy, Nati Srebro, Aude Oliva, Andrew Saxe, Olga Russakovsky
Moderator: Ali Rahimi

Author Information

Hanie Sedghi (Google Brain)
Samy Bengio (Google Research Brain Team)
Kenji Hata (Google)
Aleksander Madry (MIT)
Ari Morcos (Facebook AI Research (FAIR))
Behnam Neyshabur (Google)
Maithra Raghu (Cornell University / Google Brain)
Ali Rahimi (Google)
Ludwig Schmidt (University of California, Berkeley)
Ying Xiao (Google)
