A recent series of theoretical works showed that the dynamics of neural networks with a certain initialisation are well captured by kernel methods. Concurrent empirical work demonstrated that kernel methods can come close to the performance of neural networks on some image classification tasks. These results raise the question of whether neural networks only learn successfully if kernels also learn successfully, despite neural networks being the more expressive function class. Here, we show that two-layer neural networks with only a few neurons achieve near-optimal performance on high-dimensional Gaussian mixture classification, while lazy-training approaches such as random features and kernel methods do not. Our analysis is based on the derivation of a set of ordinary differential equations that exactly track the dynamics of the network and thus allow us to extract its asymptotic performance as a function of the regularisation or the signal-to-noise ratio. We also show that over-parametrising the neural network leads to faster convergence, but does not improve its final performance.
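To make the comparison concrete, below is a minimal finite-size sketch, not the paper's exact setup or its ODE analysis: it assumes an XOR-like four-cluster Gaussian mixture in d = 200 dimensions, trains a small two-layer tanh network with online SGD (so the first layer moves, i.e. features are learned), and uses ridge regression on fixed random ReLU features as a stand-in for lazy training. All hyperparameters (snr, K, learning rate, number of features) are illustrative choices, and exact error rates will vary with them.

```python
import numpy as np

rng = np.random.default_rng(0)
d, snr = 200, 4.0  # input dimension and cluster separation (illustrative values)

def sample(n):
    """XOR-like Gaussian mixture: four clusters on the first two axes,
    labelled by the product of the cluster signs (not linearly separable)."""
    s = rng.choice([-1.0, 1.0], size=(n, 2))
    x = rng.standard_normal((n, d))
    x[:, :2] += snr * s
    return x / np.sqrt(d), s[:, 0] * s[:, 1]

# Two-layer network with K hidden tanh neurons, trained by online SGD
# on the square loss; no lazy/NTK scaling, so the first layer is learned.
K, lr, steps = 8, 0.2, 200_000
W = rng.standard_normal((K, d)) / np.sqrt(d)   # first-layer weights
v = rng.standard_normal(K) / np.sqrt(K)        # second-layer weights

for _ in range(steps):
    x, y = sample(1)
    h = np.tanh(W @ x[0])                      # hidden activations
    err = v @ h - y[0]                         # residual of the square loss
    W -= lr * err * np.outer(v * (1.0 - h**2), x[0])
    v -= lr * err * h

# Lazy baseline: ridge regression on p fixed random ReLU features.
p, ridge = 2000, 1e-2
F = rng.standard_normal((p, d)) / np.sqrt(d)   # frozen random first layer
xtr, ytr = sample(10_000)
Ztr = np.maximum(xtr @ F.T, 0.0)
a = np.linalg.solve(Ztr.T @ Ztr + ridge * np.eye(p), Ztr.T @ ytr)

xt, yt = sample(5_000)
net_err = np.mean(np.sign(np.tanh(xt @ W.T) @ v) != yt)
rf_err = np.mean(np.sign(np.maximum(xt @ F.T, 0.0) @ a) != yt)
print(f"two-layer net test error: {net_err:.3f}")
print(f"random features test error: {rf_err:.3f}")
```

The only structural difference between the two models is that the network trains its first layer while the random-features model keeps it frozen; that is precisely the feature-learning-versus-lazy-training distinction the abstract refers to. The paper makes this gap exact in the high-dimensional limit via its ODE analysis; the snippet above only illustrates it at finite size.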
Author Information
Maria Refinetti (Laboratoire de Physique de l’Ecole Normale Supérieure Paris)
Sebastian Goldt (International School of Advanced Studies (SISSA))
Assistant professor working on theories of learning in neural networks.
Florent Krzakala (EPFL)
Lenka Zdeborova (EPFL)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Poster: Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed »
  Wed. Jul 21st 04:00 -- 06:00 PM
More from the Same Authors
- 2023 Poster: Are Gaussian Data All You Need? The Extents and Limits of Universality in High-Dimensional Generalized Linear Estimation »
  Luca Pesce · Florent Krzakala · Bruno Loureiro · Ludovic Stephan
- 2023 Poster: Neural networks trained with SGD learn distributions of increasing complexity »
  Maria Refinetti · Alessandro Ingrosso · Sebastian Goldt
- 2023 Poster: Bayes-optimal Learning of Deep Random Networks of Extensive-width »
  Hugo Cui · Florent Krzakala · Lenka Zdeborova
- 2023 Oral: Bayes-optimal Learning of Deep Random Networks of Extensive-width »
  Hugo Cui · Florent Krzakala · Lenka Zdeborova
- 2022 Poster: The dynamics of representation learning in shallow, non-linear autoencoders »
  Maria Refinetti · Sebastian Goldt
- 2022 Poster: Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension »
  Bruno Loureiro · Cedric Gerbelot · Maria Refinetti · Gabriele Sicuro · Florent Krzakala
- 2022 Poster: Maslow's Hammer in Catastrophic Forgetting: Node Re-Use vs. Node Activation »
  Sebastian Lee · Stefano Sarao Mannelli · Claudia Clopath · Sebastian Goldt · Andrew Saxe
- 2022 Spotlight: Maslow's Hammer in Catastrophic Forgetting: Node Re-Use vs. Node Activation »
  Sebastian Lee · Stefano Sarao Mannelli · Claudia Clopath · Sebastian Goldt · Andrew Saxe
- 2022 Spotlight: The dynamics of representation learning in shallow, non-linear autoencoders »
  Maria Refinetti · Sebastian Goldt
- 2022 Spotlight: Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension »
  Bruno Loureiro · Cedric Gerbelot · Maria Refinetti · Gabriele Sicuro · Florent Krzakala
- 2021: Overparametrization: Insights from solvable models »
  Lenka Zdeborova
- 2021 Poster: Align, then memorise: the dynamics of learning with feedback alignment »
  Maria Refinetti · Stéphane d'Ascoli · Ruben Ohana · Sebastian Goldt
- 2021 Spotlight: Align, then memorise: the dynamics of learning with feedback alignment »
  Maria Refinetti · Stéphane d'Ascoli · Ruben Ohana · Sebastian Goldt
- 2021 Poster: Continual Learning in the Teacher-Student Setup: Impact of Task Similarity »
  Sebastian Lee · Sebastian Goldt · Andrew Saxe
- 2021 Spotlight: Continual Learning in the Teacher-Student Setup: Impact of Task Similarity »
  Sebastian Lee · Sebastian Goldt · Andrew Saxe
- 2020 Poster: Generalisation error in learning with random features and the hidden manifold model »
  Federica Gerace · Bruno Loureiro · Florent Krzakala · Marc Mezard · Lenka Zdeborova
- 2020 Poster: Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime »
  Stéphane d'Ascoli · Maria Refinetti · Giulio Biroli · Florent Krzakala
- 2020 Poster: The Role of Regularization in Classification of High-dimensional Noisy Gaussian Mixture »
  Francesca Mignacco · Florent Krzakala · Yue Lu · Pierfrancesco Urbani · Lenka Zdeborova
- 2019: Poster discussion »
  Roman Novak · Maxime Gabella · Frederic Dreyer · Siavash Golkar · Anh Tong · Irina Higgins · Mirco Milletari · Joe Antognini · Sebastian Goldt · Adín Ramírez Rivera · Roberto Bondesan · Ryo Karakida · Remi Tachet des Combes · Michael Mahoney · Nicholas Walker · Stanislav Fort · Samuel Smith · Rohan Ghosh · Aristide Baratin · Diego Granziol · Stephen Roberts · Dmitry Vetrov · Andrew Wilson · César Laurent · Valentin Thomas · Simon Lacoste-Julien · Dar Gilboa · Daniel Soudry · Anupam Gupta · Anirudh Goyal · Yoshua Bengio · Erich Elsen · Soham De · Stanislaw Jastrzebski · Charles H Martin · Samira Shabanian · Aaron Courville · Shotaro Akaho · Lenka Zdeborova · Ethan Dyer · Maurice Weiler · Pim de Haan · Taco Cohen · Max Welling · Ping Luo · Zhanglin Peng · Nasim Rahaman · Loic Matthey · Danilo J. Rezende · Jaesik Choi · Kyle Cranmer · Lechao Xiao · Jaehoon Lee · Yasaman Bahri · Jeffrey Pennington · Greg Yang · Jiri Hron · Jascha Sohl-Dickstein · Guy Gur-Ari
- 2019: Analyzing the dynamics of online learning in over-parameterized two-layer neural networks »
  Sebastian Goldt
- 2019: Loss landscape and behaviour of algorithms in the spiked matrix-tensor model »
  Lenka Zdeborova
- 2019 Poster: Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models »
  Stefano Sarao Mannelli · Florent Krzakala · Pierfrancesco Urbani · Lenka Zdeborova
- 2019 Oral: Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models »
  Stefano Sarao Mannelli · Florent Krzakala · Pierfrancesco Urbani · Lenka Zdeborova