Over-parameterization: Pitfalls and Opportunities

Workshop

Yasaman Bahri · Quanquan Gu · Amin Karbasi · Hanie Sedghi

Sat 24 Jul, 9 a.m. PDT

Modern machine learning models are often highly over-parameterized. The prime examples are neural network architectures that achieve state-of-the-art performance while having many more parameters than training examples. Although these models can perform very well empirically, they are not well understood, and worst-case theories of learnability do not explain their behavior. Indeed, over-parameterized models sometimes exhibit "benign overfitting": they have the capacity to fit the training data perfectly (even data modified to have random labels), yet they achieve good performance on the test data. There is evidence that over-parameterization may be helpful both computationally and statistically, although attempts to use phenomena such as double/multiple descent to explain how over-parameterization helps achieve small test error remain controversial. Besides benign overfitting and double/multiple descent, many other interesting phenomena arise from over-parameterization, and many more may yet be discovered. Many of these effects depend on the properties of the data, but we have only simplistic tools to measure, quantify, and understand data. In light of rapid progress and rapidly shifting understanding, we believe that the time is ripe for a workshop focusing on understanding over-parameterization from multiple angles.

Timezone: America/Los_Angeles

Schedule

Sat 9:00 a.m. - 9:05 a.m.	Opening Remarks ( opening )
Sat 9:05 a.m. - 9:50 a.m.	Adversarial Examples in Random Deep Networks ( Invited talk )	Peter Bartlett
Sat 9:50 a.m. - 10:00 a.m.	Live Q&A with Peter Bartlett ( Live Q&A )
Sat 10:00 a.m. - 10:55 a.m.	The Polyak-Lojasiewicz condition as a framework for over-parameterized optimization and its application to deep learning ( Invited talk )	Mikhail Belkin
Sat 10:55 a.m. - 11:10 a.m.	Distributional Generalization: A New Kind of Generalization (Extended Abstract) ( Spotlight )	Preetum Nakkiran · Yamini Bansal
Sat 11:10 a.m. - 11:25 a.m.	Understanding the effect of sparsity on neural networks robustness ( Spotlight )	Lukas Timpl · Rahim Entezari · Hanie Sedghi · Behnam Neyshabur · Olga Saukh
Sat 11:25 a.m. - 11:40 a.m.	On the Generalization Improvement from Neural Network Pruning ( Spotlight )	Tian Jin · Gintare Karolina Dziugaite · Michael Carbin
Sat 12:30 p.m. - 1:25 p.m.	Overparametrization: Insights from solvable models ( Invited talk )	Lenka Zdeborova
Sat 1:25 p.m. - 2:20 p.m.	The generalization behavior of random feature and neural tangent models ( Invited talk )	Andrea Montanari
Sat 2:20 p.m. - 2:35 p.m.	Towards understanding how momentum improves generalization in deep learning ( Spotlight )	Samy Jelassi · Yuanzhi Li
Sat 2:35 p.m. - 2:50 p.m.	Feature Learning in Infinite-Width Neural Networks ( Spotlight )	Greg Yang · Edward Hu
Sat 2:50 p.m. - 3:05 p.m.	A Universal Law of Robustness via Isoperimetry ( Spotlight )	Sebastien Bubeck · Mark Sellke
Sat 3:55 p.m. - 4:50 p.m.	Universal Prediction Band, Semi-Definite Programming and Variance Interpolation ( Invited talk )	Tengyuan Liang
Sat 4:50 p.m. - 5:45 p.m.	Function space view of Multi-Channel Linear Convolutional Networks with Bounded Weight Norm ( Invited talk )	Suriya Gunasekar
Sat 5:45 p.m. - 6:00 p.m.	Value-Based Deep Reinforcement Learning Requires Explicit Regularization ( Spotlight )	Aviral Kumar · Rishabh Agarwal · Aaron Courville · Tengyu Ma · George Tucker · Sergey Levine
Sat 6:00 p.m. - 6:15 p.m.	Beyond Implicit Regularization: Avoiding Overfitting via Regularizer Mirror Descent ( Spotlight )	Navid Azizan · Sahin Lale · Babak Hassibi
Sat 6:15 p.m. - 6:20 p.m.	Closing Remarks ( closing )
-	Generalization Error and Overparameterization While Learning over Networks ( Poster )	Martin Hellkvist · Ayca Ozcelikkale
-	On the interplay between data structure and loss function: an analytical study of generalization for classification ( Poster )	Stéphane d'Ascoli · Marylou Gabrié · Levent Sagun · Giulio Biroli
-	Finite-Sample Analysis of Interpolating Linear Classifiers in the Overparameterized Regime ( Poster )	Niladri Chatterji · Phil Long
-	Some samples are more similar than others! A different look at memorization and generalization in neural networks. ( Poster )	Sudhanshu Ranjan
-	When does gradient descent with logistic loss interpolate using deep networks with smoothed ReLU activations? ( Poster )	Niladri Chatterji · Phil Long · Peter Bartlett
-	On Alignment in Deep Linear Neural Networks ( Poster )	Adityanarayanan Radhakrishnan · Eshaan Nichani · Daniel Bernstein · Caroline Uhler
-	Increasing Depth Leads to U-Shaped Test Risk in Over-parameterized Convolutional Networks ( Poster )	Eshaan Nichani · Adityanarayanan Radhakrishnan · Caroline Uhler
-	How does Over-Parametrization Lead to Acceleration for Learning a Single Teacher Neuron with Quadratic Activation? ( Poster )	Jun-Kun Wang · Jacob Abernethy
-	Empirical Study on the Effective VC Dimension of Low-rank Neural Networks ( Poster )	Daewon Seo · Hongyi Wang · Dimitris Papailiopoulos · Kangwook Lee
-	Benign Overfitting in Adversarially Robust Linear Classification ( Poster )	Jinghui Chen · Yuan Cao · Quanquan Gu
-	Mitigating deep double descent by concatenating inputs ( Poster )	John Chen · Qihan Wang · Anastasios Kyrillidis
-	Robust Generalization of Quadratic Neural Networks via Function Identification ( Poster )	Kan Xu · Hamsa Bastani · Osbert Bastani
-	Label Noise SGD Provably Prefers Flat Global Minimizers ( Poster )	Alex Damian · Tengyu Ma · Jason Lee
-	On the Origins of the Block Structure Phenomenon in Neural Network Representations ( Poster )	Thao Nguyen · Maithra Raghu · Simon Kornblith
-	Structured Model Pruning of Convolutional Networks on Tensor Processing Units ( Poster )	Kongtao Chen
-	Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation ( Poster )	Ke Wang · Vidya Muthukumar · Christos Thrampoulidis
-	Inductive Bias of Multi-Channel Linear Convolutional Networks with Bounded Weight Norm ( Poster )	Meena Jagadeesan · Ilya Razenshteyn · Suriya Gunasekar
-	Sample Complexity and Overparameterization Bounds for Temporal Difference Learning with Neural Network Approximation ( Poster )	Semih Cayci · Siddhartha Satpathi · Niao He · R Srikant
-	Double Descent in Feature Selection: Revisiting LASSO and Basis Pursuit ( Poster )	David Bosch · Ashkan Panahi · Ayca Ozcelikkale
-	On Low Rank Training of Deep Neural Networks ( Poster )	Siddhartha Kamalakara · Acyr Locatelli · Bharat Venkitesh · Jimmy Ba · Yarin Gal · Aidan Gomez
-	On the Sparsity of Deep Neural Networks in the Overparameterized Regime: An Empirical Study ( Poster )	Rahul Parhi · Jack Wolf · Robert Nowak
-	Implicit Acceleration and Feature Learning in Infinitely Wide Neural Networks with Bottlenecks ( Poster )	Etai Littwin · Omid Saremi · Shuangfei Zhai · Vimal Thilak · Hanlin Goh · Joshua M Susskind · Greg Yang
-	Classification and Adversarial Examples in an Overparameterized Linear Model: A Signal-Processing Perspective ( Poster )	Adhyyan Narang · Vidya Muthukumar · Anant Sahai
-	Gradient Starvation: A Learning Proclivity in Neural Networks ( Poster )	Mohammad Pezeshki · Sékou-Oumar Kaba · Yoshua Bengio · Aaron Courville · Doina Precup · Guillaume Lajoie
-	Studying the Consistency and Composability of Lottery Ticket Pruning Masks ( Poster )	Rajiv Movva · Michael Carbin · Jonathan Frankle
-	Epoch-Wise Double Descent: A Theory of Multi-scale Feature Learning Dynamics ( Poster )	Mohammad Pezeshki · Amartya Mitra · Yoshua Bengio · Guillaume Lajoie
-	Implicit Greedy Rank Learning in Autoencoders via Overparameterized Linear Networks ( Poster )	Shih-Yu Sun · Vimal Thilak · Etai Littwin · Omid Saremi · Joshua M Susskind
-	Assessing Generalization of SGD via Disagreement Rates ( Poster )	YiDing Jiang · Vaishnavh Nagarajan · Zico Kolter
-	Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures ( Poster )	Yuan Cao · Quanquan Gu · Mikhail Belkin
-	Rethinking compactness in deep neural networks ( Poster )	Kateryna Chumachenko · Firas Laakom · Jenni Raitoharju · Alexandros Iosifidis · Moncef Gabbouj
-	Overfitting of Polynomial Regression with Overparameterization ( Poster )	Hugo Fabregues · Berfin Simsek
-	On the memorization properties of contrastive learning ( Poster )	Ildus Sadrtdinov · Nadezhda Chirkova · Ekaterina Lobacheva
-	Over-Parameterization and Generalization in Audio Classification ( Poster )	Khaled Koutini · Hamid Eghbalzadeh · Florian Henkel · Jan Schlüter · Gerhard Widmer
-	Surprising benefits of ridge regularization for noiseless regression ( Poster )	Konstantin Donhauser · Alexandru Tifrea · Michael Aerni · Reinhard Heckel · Fanny Yang
-	Binary Classification of Gaussian Mixtures: Abundance of Support Vectors, Benign Overfitting and Regularization ( Poster )	Ke Wang · Christos Thrampoulidis
-	Label-Imbalanced and Group-Sensitive Classification under Overparameterization ( Poster )	Ganesh Ramachandra Kini · Orestis Paraskevas · Samet Oymak · Christos Thrampoulidis
-	Early-stopped neural networks are consistent ( Poster )	Ziwei Ji · Matus Telgarsky
-	Distributional Generalization: A New Kind of Generalization (Extended Abstract) ( Poster )	Preetum Nakkiran · Yamini Bansal
-	Feature Learning in Infinite-Width Neural Networks ( Poster )	Greg Yang · Edward Hu
-	On the Generalization Improvement from Neural Network Pruning ( Poster )	Tian Jin · Gintare Karolina Dziugaite · Michael Carbin
-	A Universal Law of Robustness via Isoperimetry ( Poster )	Sebastien Bubeck · Mark Sellke
-	Understanding the effect of sparsity on neural networks robustness ( Poster )	Lukas Timpl · Rahim Entezari · Hanie Sedghi · Behnam Neyshabur · Olga Saukh
-	Beyond Implicit Regularization: Avoiding Overfitting via Regularizer Mirror Descent ( Poster )	Navid Azizan · Sahin Lale · Babak Hassibi
-	Value-Based Deep Reinforcement Learning Requires Explicit Regularization ( Poster )	Aviral Kumar · Rishabh Agarwal · Aaron Courville · Tengyu Ma · George Tucker · Sergey Levine
-	Towards understanding how momentum improves generalization in deep learning ( Poster )	Samy Jelassi · Yuanzhi Li