Our understanding of modern neural networks lags behind their practical successes. As this understanding gap grows, it poses a serious challenge to the future pace of progress, because fewer pillars of knowledge will be available to designers of models and algorithms. This workshop aims to close this understanding gap in deep learning. It solicits contributions that view the behavior of deep nets as a natural phenomenon to investigate with methods inspired by the natural sciences, such as physics, astronomy, and biology. We solicit empirical work that isolates phenomena in deep nets, describes them quantitatively, and then replicates or falsifies them.
As a starting point for this effort, we focus on the interplay between data, network architecture, and training algorithms. We are looking for contributions that identify precise, reproducible phenomena, as well as systematic studies and evaluations of current beliefs such as “sharp local minima do not generalize well” or “SGD navigates out of local minima”. Through the workshop, we hope to catalogue quantifiable versions of such statements, as well as demonstrate whether or not they occur reproducibly.
Sat 8:45 a.m. - 9:00 a.m. | Opening Remarks | Hanie Sedghi
Sat 9:00 a.m. - 9:30 a.m. | Nati Srebro: Optimization’s Untold Gift to Learning: Implicit Regularization (Talk) | Nati Srebro
Sat 9:30 a.m. - 9:45 a.m. | Bad Global Minima Exist and SGD Can Reach Them (Spotlight)
Sat 9:45 a.m. - 10:00 a.m. | Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask (Spotlight)
Sat 10:00 a.m. - 10:30 a.m. | Chiyuan Zhang: Are all layers created equal? -- Studies on how neural networks represent functions (Talk)
Sat 10:30 a.m. - 11:00 a.m. | Break and Posters
Sat 11:00 a.m. - 11:15 a.m. | Line attractor dynamics in recurrent networks for sentiment classification (Spotlight)
Sat 11:15 a.m. - 11:30 a.m. | Do deep neural networks learn shallow learnable examples first? (Spotlight)
Sat 11:30 a.m. - 12:00 p.m. | Crowdsourcing Deep Learning Phenomena
Sat 12:00 p.m. - 1:30 p.m. | Lunch and Posters
Sat 1:30 p.m. - 2:00 p.m. | Aude Oliva: Reverse engineering neuroscience and cognitive science principles (Talk)
Sat 2:00 p.m. - 2:15 p.m. | On Understanding the Hardness of Samples in Neural Networks (Spotlight)
Sat 2:15 p.m. - 2:30 p.m. | On the Convex Behavior of Deep Neural Networks in Relation to the Layers' Width (Spotlight)
Sat 2:30 p.m. - 3:00 p.m. | Andrew Saxe: Intriguing phenomena in training and generalization dynamics of deep networks (Invited Talk) | Andrew Saxe
In this talk I will describe several phenomena related to learning dynamics in deep networks. Among these are (a) large transient training error spikes during full-batch gradient descent, with implications for the training error surface; (b) surprisingly strong generalization performance of large networks with modest label noise even with infinite training time; (c) a training speed/test accuracy trade-off in vanilla deep networks; (d) the inability of deep networks to learn known efficient representations of certain functions; and finally (e) a trade-off between training speed and multitasking ability.
Sat 3:00 p.m. - 4:00 p.m. | Break and Posters
Sat 4:00 p.m. - 4:30 p.m. | Olga Russakovsky (Invited Talk) | Olga Russakovsky
Sat 4:30 p.m. - 5:30 p.m. | Panel Discussion: Kevin Murphy, Nati Srebro, Aude Oliva, Andrew Saxe, Olga Russakovsky. Moderator: Ali Rahimi (Panel Discussion)
Author Information
Hanie Sedghi (Google Brain)
Samy Bengio (Google Research Brain Team)
Kenji Hata (Google)
Aleksander Madry (MIT)
Ari Morcos (Facebook AI Research (FAIR))
Behnam Neyshabur (Google)
Maithra Raghu (Cornell University / Google Brain)
Ali Rahimi (Google)
Ludwig Schmidt (University of California, Berkeley)
Ying Xiao (Google)
More from the Same Authors
-
2022 : A Game-Theoretic Perspective on Trust in Recommendation »
Sarah Cen · Andrew Ilyas · Aleksander Madry -
2023 : ModelDiff: A Framework for Comparing Learning Algorithms »
Harshay Shah · Sung Min (Sam) Park · Andrew Ilyas · Aleksander Madry -
2023 : Dataset Interfaces: Diagnosing Model Failures Using Controllable Counterfactual Generation »
Joshua Vendrow · Saachi Jain · Logan Engstrom · Aleksander Madry -
2023 : SemDeDup: Data-efficient learning at web-scale through semantic deduplication »
Amro Abbas · Daniel Simig · Surya Ganguli · Ari Morcos · Kushal Tirumala -
2023 : D4: Document Deduplication and Diversification »
Kushal Tirumala · Daniel Simig · Armen Aghajanyan · Ari Morcos -
2023 : What Works in Chest X-Ray Classification? A Case Study of Design Choices »
Evan Vogelbaum · Logan Engstrom · Aleksander Madry -
2023 : The Journey, Not the Destination: How Data Guides Diffusion Models »
Kristian Georgiev · Joshua Vendrow · Hadi Salman · Sung Min (Sam) Park · Aleksander Madry -
2023 : Panel on Reasoning Capabilities of LLMs »
Guy Van den Broeck · Ishita Dasgupta · Subbarao Kambhampati · Jiajun Wu · Xi Victoria Lin · Samy Bengio · Beliz Gunel -
2023 : Generalization on the Unseen, Logic Reasoning and Degree Curriculum »
Samy Bengio -
2023 Panel: The Societal Impacts of AI »
Sanmi Koyejo · Samy Bengio · Ashia Wilson · Kirikowhai Mikaere · Joelle Pineau -
2023 Poster: TRAK: Attributing Model Behavior at Scale »
Sung Min (Sam) Park · Kristian Georgiev · Andrew Ilyas · Guillaume Leclerc · Aleksander Madry -
2023 Oral: TRAK: Attributing Model Behavior at Scale »
Sung Min (Sam) Park · Kristian Georgiev · Andrew Ilyas · Guillaume Leclerc · Aleksander Madry -
2023 Poster: Generalization on the Unseen, Logic Reasoning and Degree Curriculum »
Emmanuel Abbe · Samy Bengio · Aryo Lotfi · Kevin Rizk -
2023 Poster: ModelDiff: A Framework for Comparing Learning Algorithms »
Harshay Shah · Sung Min (Sam) Park · Andrew Ilyas · Aleksander Madry -
2023 Oral: Generalization on the Unseen, Logic Reasoning and Degree Curriculum »
Emmanuel Abbe · Samy Bengio · Aryo Lotfi · Kevin Rizk -
2023 Oral: Raising the Cost of Malicious AI-Powered Image Editing »
Hadi Salman · Alaa Khaddaj · Guillaume Leclerc · Andrew Ilyas · Aleksander Madry -
2023 Poster: Rethinking Backdoor Attacks »
Alaa Khaddaj · Guillaume Leclerc · Aleksandar Makelov · Kristian Georgiev · Hadi Salman · Andrew Ilyas · Aleksander Madry -
2023 Poster: Raising the Cost of Malicious AI-Powered Image Editing »
Hadi Salman · Alaa Khaddaj · Guillaume Leclerc · Andrew Ilyas · Aleksander Madry -
2022 : How Neural Networks See, Learn and Forget »
Maithra Raghu -
2022 : Panel discussion »
Steffen Schneider · Aleksander Madry · Alexei Efros · Chelsea Finn · Soheil Feizi -
2022 : Dr. Aleksander Madry's Talk »
Aleksander Madry -
2022 : Invited Talk 1: Aleksander Mądry »
Aleksander Madry -
2022 Workshop: Knowledge Retrieval and Language Models »
Maithra Raghu · Urvashi Khandelwal · Chiyuan Zhang · Matei Zaharia · Alexander Rush -
2022 Poster: Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time »
Mitchell Wortsman · Gabriel Ilharco · Samir Gadre · Becca Roelofs · Raphael Gontijo Lopes · Ari Morcos · Hongseok Namkoong · Ali Farhadi · Yair Carmon · Simon Kornblith · Ludwig Schmidt -
2022 Spotlight: Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time »
Mitchell Wortsman · Gabriel Ilharco · Samir Gadre · Becca Roelofs · Raphael Gontijo Lopes · Ari Morcos · Hongseok Namkoong · Ali Farhadi · Yair Carmon · Simon Kornblith · Ludwig Schmidt -
2022 Poster: COAT: Measuring Object Compositionality in Emergent Representations »
Sirui Xie · Ari Morcos · Song-Chun Zhu · Shanmukha Ramakrishna Vedantam -
2022 Poster: Datamodels: Understanding Predictions with Data and Data with Predictions »
Andrew Ilyas · Sung Min (Sam) Park · Logan Engstrom · Guillaume Leclerc · Aleksander Madry -
2022 Poster: Adversarially trained neural representations are already as robust as biological neural representations »
Chong Guo · Michael Lee · Guillaume Leclerc · Joel Dapello · Yug Rao · Aleksander Madry · James DiCarlo -
2022 Oral: Adversarially trained neural representations are already as robust as biological neural representations »
Chong Guo · Michael Lee · Guillaume Leclerc · Joel Dapello · Yug Rao · Aleksander Madry · James DiCarlo -
2022 Spotlight: Datamodels: Understanding Predictions with Data and Data with Predictions »
Andrew Ilyas · Sung Min (Sam) Park · Logan Engstrom · Guillaume Leclerc · Aleksander Madry -
2022 Spotlight: COAT: Measuring Object Compositionality in Emergent Representations »
Sirui Xie · Ari Morcos · Song-Chun Zhu · Shanmukha Ramakrishna Vedantam -
2022 Poster: Combining Diverse Feature Priors »
Saachi Jain · Dimitris Tsipras · Aleksander Madry -
2022 Spotlight: Combining Diverse Feature Priors »
Saachi Jain · Dimitris Tsipras · Aleksander Madry -
2021 : Invited Talk #4 »
Aleksander Madry -
2021 Poster: CURI: A Benchmark for Productive Concept Learning Under Uncertainty »
Shanmukha Ramakrishna Vedantam · Arthur Szlam · Maximilian Nickel · Ari Morcos · Brenden Lake -
2021 Spotlight: CURI: A Benchmark for Productive Concept Learning Under Uncertainty »
Shanmukha Ramakrishna Vedantam · Arthur Szlam · Maximilian Nickel · Ari Morcos · Brenden Lake -
2021 Poster: ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases »
Stéphane d'Ascoli · Hugo Touvron · Matthew Leavitt · Ari Morcos · Giulio Biroli · Levent Sagun -
2021 Poster: Leveraging Sparse Linear Layers for Debuggable Deep Networks »
Eric Wong · Shibani Santurkar · Aleksander Madry -
2021 Spotlight: ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases »
Stéphane d'Ascoli · Hugo Touvron · Matthew Leavitt · Ari Morcos · Giulio Biroli · Levent Sagun -
2021 Oral: Leveraging Sparse Linear Layers for Debuggable Deep Networks »
Eric Wong · Shibani Santurkar · Aleksander Madry -
2020 Poster: Neural Kernels Without Tangents »
Vaishaal Shankar · Alex Fang · Wenshuo Guo · Sara Fridovich-Keil · Jonathan Ragan-Kelley · Ludwig Schmidt · Benjamin Recht -
2020 Poster: Evaluating Machine Accuracy on ImageNet »
Vaishaal Shankar · Rebecca Roelofs · Horia Mania · Alex Fang · Benjamin Recht · Ludwig Schmidt -
2020 Poster: From ImageNet to Image Classification: Contextualizing Progress on Benchmarks »
Dimitris Tsipras · Shibani Santurkar · Logan Engstrom · Andrew Ilyas · Aleksander Madry -
2020 Poster: Identifying Statistical Bias in Dataset Replication »
Logan Engstrom · Andrew Ilyas · Shibani Santurkar · Dimitris Tsipras · Jacob Steinhardt · Aleksander Madry -
2020 Poster: The Effect of Natural Distribution Shift on Question Answering Models »
John Miller · Karl Krauth · Benjamin Recht · Ludwig Schmidt -
2020 Affinity Workshop: New In ML »
Zhen Xu · Sparkle Russell-Puleri · Zhengying Liu · Sinead A Williamson · Matthias W Seeger · Wei-Wei Tu · Samy Bengio · Isabelle Guyon -
2019 : Panel Discussion (Nati Srebro, Dan Roy, Chelsea Finn, Mikhail Belkin, Aleksander Mądry, Jason Lee) »
Nati Srebro · Daniel Roy · Chelsea Finn · Mikhail Belkin · Aleksander Madry · Jason Lee -
2019 : Keynote by Aleksander Mądry: Are All Features Created Equal? »
Aleksander Madry -
2019 Workshop: Understanding and Improving Generalization in Deep Learning »
Dilip Krishnan · Hossein Mobahi · Behnam Neyshabur · Peter Bartlett · Dawn Song · Nati Srebro -
2019 Poster: Faster Algorithms for Binary Matrix Factorization »
Ravi Kumar · Rina Panigrahy · Ali Rahimi · David Woodruff -
2019 Oral: Faster Algorithms for Binary Matrix Factorization »
Ravi Kumar · Rina Panigrahy · Ali Rahimi · David Woodruff -
2019 Poster: Direct Uncertainty Prediction for Medical Second Opinions »
Maithra Raghu · Katy Blumer · Rory Sayres · Ziad Obermeyer · Bobby Kleinberg · Sendhil Mullainathan · Jon Kleinberg -
2019 Poster: Exploring the Landscape of Spatial Robustness »
Logan Engstrom · Brandon Tran · Dimitris Tsipras · Ludwig Schmidt · Aleksander Madry -
2019 Poster: Do ImageNet Classifiers Generalize to ImageNet? »
Benjamin Recht · Rebecca Roelofs · Ludwig Schmidt · Vaishaal Shankar -
2019 Oral: Exploring the Landscape of Spatial Robustness »
Logan Engstrom · Brandon Tran · Dimitris Tsipras · Ludwig Schmidt · Aleksander Madry -
2019 Oral: Do ImageNet Classifiers Generalize to ImageNet? »
Benjamin Recht · Rebecca Roelofs · Ludwig Schmidt · Vaishaal Shankar -
2019 Oral: Direct Uncertainty Prediction for Medical Second Opinions »
Maithra Raghu · Katy Blumer · Rory Sayres · Ziad Obermeyer · Bobby Kleinberg · Sendhil Mullainathan · Jon Kleinberg -
2019 Poster: Area Attention »
Yang Li · Lukasz Kaiser · Samy Bengio · Si Si -
2019 Oral: Area Attention »
Yang Li · Lukasz Kaiser · Samy Bengio · Si Si -
2018 Poster: On the Limitations of First-Order Approximation in GAN Dynamics »
Jerry Li · Aleksander Madry · John Peebles · Ludwig Schmidt -
2018 Oral: On the Limitations of First-Order Approximation in GAN Dynamics »
Jerry Li · Aleksander Madry · John Peebles · Ludwig Schmidt -
2018 Poster: Measuring abstract reasoning in neural networks »
Adam Santoro · Felix Hill · David GT Barrett · Ari S Morcos · Timothy Lillicrap -
2018 Poster: Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games? »
Maithra Raghu · Alexander Irpan · Jacob Andreas · Bobby Kleinberg · Quoc Le · Jon Kleinberg -
2018 Poster: Stronger Generalization Bounds for Deep Nets via a Compression Approach »
Sanjeev Arora · Rong Ge · Behnam Neyshabur · Yi Zhang -
2018 Poster: Fast Decoding in Sequence Models Using Discrete Latent Variables »
Lukasz Kaiser · Samy Bengio · Aurko Roy · Ashish Vaswani · Niki Parmar · Jakob Uszkoreit · Noam Shazeer -
2018 Oral: Measuring abstract reasoning in neural networks »
Adam Santoro · Felix Hill · David GT Barrett · Ari S Morcos · Timothy Lillicrap -
2018 Oral: Stronger Generalization Bounds for Deep Nets via a Compression Approach »
Sanjeev Arora · Rong Ge · Behnam Neyshabur · Yi Zhang -
2018 Oral: Fast Decoding in Sequence Models Using Discrete Latent Variables »
Lukasz Kaiser · Samy Bengio · Aurko Roy · Ashish Vaswani · Niki Parmar · Jakob Uszkoreit · Noam Shazeer -
2018 Oral: Can Deep Reinforcement Learning Solve Erdos-Selfridge-Spencer Games? »
Maithra Raghu · Alexander Irpan · Jacob Andreas · Bobby Kleinberg · Quoc Le · Jon Kleinberg -
2018 Poster: A Classification-Based Study of Covariate Shift in GAN Distributions »
Shibani Santurkar · Ludwig Schmidt · Aleksander Madry -
2018 Oral: A Classification-Based Study of Covariate Shift in GAN Distributions »
Shibani Santurkar · Ludwig Schmidt · Aleksander Madry -
2017 Workshop: Reproducibility in Machine Learning Research »
Rosemary Nan Ke · Anirudh Goyal · Alex Lamb · Joelle Pineau · Samy Bengio · Yoshua Bengio -
2017 Poster: Device Placement Optimization with Reinforcement Learning »
Azalia Mirhoseini · Hieu Pham · Quoc Le · Benoit Steiner · Mohammad Norouzi · Rasmus Larsen · Yuefeng Zhou · Naveen Kumar · Samy Bengio · Jeff Dean -
2017 Talk: Device Placement Optimization with Reinforcement Learning »
Azalia Mirhoseini · Hieu Pham · Quoc Le · Benoit Steiner · Mohammad Norouzi · Rasmus Larsen · Yuefeng Zhou · Naveen Kumar · Samy Bengio · Jeff Dean -
2017 Poster: Sharp Minima Can Generalize For Deep Nets »
Laurent Dinh · Razvan Pascanu · Samy Bengio · Yoshua Bengio -
2017 Poster: On the Expressive Power of Deep Neural Networks »
Maithra Raghu · Ben Poole · Surya Ganguli · Jon Kleinberg · Jascha Sohl-Dickstein -
2017 Talk: On the Expressive Power of Deep Neural Networks »
Maithra Raghu · Ben Poole · Surya Ganguli · Jon Kleinberg · Jascha Sohl-Dickstein -
2017 Talk: Sharp Minima Can Generalize For Deep Nets »
Laurent Dinh · Razvan Pascanu · Samy Bengio · Yoshua Bengio