Neural networks with a large number of parameters admit a mean-field description, which has recently served as a theoretical explanation for the favorable training properties of such models. In this regime, gradient descent obeys a deterministic partial differential equation (PDE) that converges to a globally optimal solution for networks with a single hidden layer under appropriate assumptions. In this work, we propose a non-local mass transport dynamics that leads to a modified PDE with the same minimizer. We implement this non-local dynamics as a stochastic neuronal birth/death process and prove that it accelerates the rate of convergence in the mean-field limit. We subsequently realize this PDE with two classes of numerical schemes that converge to the mean-field equation, each of which can easily be implemented for neural networks with finite numbers of parameters. We illustrate our algorithms with two models to provide intuition for the mechanism through which convergence is accelerated.
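To give a rough sense of what a neuronal birth/death process interleaved with gradient descent could look like for a finite network, here is a minimal sketch. It is not the authors' algorithm. The tanh unit, the squared loss, the function names (`unit`, `potential`, `gradient_step`, `birth_death_step`), the rate parameter `alpha`, and the uniform resampling used to keep the number of neurons fixed are all assumptions made for illustration.

```python
import numpy as np

def unit(theta, x):
    # Hypothetical neuron: theta[:, 0] is an output weight c, theta[:, 1] an input weight w.
    c, w = theta[:, 0:1], theta[:, 1:2]
    return c * np.tanh(w * x[None, :])              # shape (n_neurons, n_samples)

def loss(theta, x, y):
    # Squared loss of the mean-field network f(x) = average of the neuron outputs.
    return 0.5 * np.mean((unit(theta, x).mean(axis=0) - y) ** 2)

def potential(theta, x, y):
    # Per-neuron potential: correlation of each neuron's output with the residual.
    phi = unit(theta, x)
    residual = phi.mean(axis=0) - y
    return phi @ residual / x.size                  # shape (n_neurons,)

def gradient_step(theta, x, y, dt):
    # Move every neuron down the gradient of its own potential (transport part).
    c, w = theta[:, 0:1], theta[:, 1:2]
    t = np.tanh(w * x[None, :])
    residual = (c * t).mean(axis=0) - y
    grad_c = t @ residual / x.size
    grad_w = (c * x[None, :] * (1.0 - t ** 2)) @ residual / x.size
    return theta - dt * np.stack([grad_c, grad_w], axis=1)

def birth_death_step(theta, x, y, alpha, dt, rng):
    # Neurons whose potential is above the population mean may die;
    # neurons below the mean may be duplicated (assumed rate: alpha * |excess|).
    n = theta.shape[0]
    excess = potential(theta, x, y)
    excess = excess - excess.mean()
    fire = rng.random(n) < 1.0 - np.exp(-alpha * np.abs(excess) * dt)
    kill = fire & (excess > 0)
    clone = fire & (excess < 0)
    new = np.concatenate([theta[~kill], theta[clone]], axis=0)
    # Assumed correction: resample uniformly so the population size stays n.
    if new.shape[0] > n:
        new = new[rng.choice(new.shape[0], size=n, replace=False)]
    elif new.shape[0] < n:
        extra = rng.choice(new.shape[0], size=n - new.shape[0], replace=True)
        new = np.concatenate([new, new[extra]], axis=0)
    return new
```

A training loop under these assumptions would alternate `gradient_step` and `birth_death_step` on the same array of neurons. Setting `alpha=0.0` disables the birth/death part and leaves only the gradient update.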
Author Information
Grant Rotskoff (New York University)
Samy Jelassi (Princeton University)
Joan Bruna (New York University)
Eric Vanden-Eijnden (New York University)
Related Events (a corresponding poster, oral, or spotlight)
-
2019 Poster: Neuron birth-death dynamics accelerates gradient descent and converges asymptotically »
Fri. Jun 14th 01:30 -- 04:00 AM Room Pacific Ballroom #93
More from the Same Authors
-
2021 : Towards understanding how momentum improves generalization in deep learning »
Samy Jelassi · Yuanzhi Li -
2023 : Reliable coarse-grained turbulent simulations through combined offline learning and neural emulation »
Chris Pedersen · Laure Zanna · Joan Bruna · Pavel Perezhogin -
2023 Poster: Conditionally Strongly Log-Concave Generative Models »
Florentin Guth · Etienne Lempereur · Joan Bruna · Stéphane Mallat -
2023 Poster: Beyond the Edge of Stability via Two-step Gradient Updates »
Lei Chen · Joan Bruna -
2022 Poster: Towards understanding how momentum improves generalization in deep learning »
Samy Jelassi · Yuanzhi Li -
2022 Poster: Extended Unconstrained Features Model for Exploring Deep Neural Collapse »
Tom Tirer · Joan Bruna -
2022 Spotlight: Extended Unconstrained Features Model for Exploring Deep Neural Collapse »
Tom Tirer · Joan Bruna -
2022 Spotlight: Towards understanding how momentum improves generalization in deep learning »
Samy Jelassi · Yuanzhi Li -
2021 Workshop: ICML Workshop on Representation Learning for Finance and E-Commerce Applications »
Senthil Kumar · Sameena Shah · Joan Bruna · Tom Goldstein · Erik Mueller · Oleg Rokhlenko · Hongxia Yang · Jianpeng Xu · Oluwatobi O Olabiyi · Charese Smiley · C. Bayan Bruss · Saurabh H Nagrecha · Svitlana Vyetrenko -
2021 Poster: On Energy-Based Models with Overparametrized Shallow Neural Networks »
Carles Domingo-Enrich · Alberto Bietti · Eric Vanden-Eijnden · Joan Bruna -
2021 Oral: On Energy-Based Models with Overparametrized Shallow Neural Networks »
Carles Domingo-Enrich · Alberto Bietti · Eric Vanden-Eijnden · Joan Bruna -
2021 Poster: Offline Contextual Bandits with Overparameterized Models »
David Brandfonbrener · William Whitney · Rajesh Ranganath · Joan Bruna -
2021 Poster: A Functional Perspective on Learning Symmetric Functions with Neural Networks »
Aaron Zweig · Joan Bruna -
2021 Spotlight: A Functional Perspective on Learning Symmetric Functions with Neural Networks »
Aaron Zweig · Joan Bruna -
2021 Spotlight: Offline Contextual Bandits with Overparameterized Models »
David Brandfonbrener · William Whitney · Rajesh Ranganath · Joan Bruna -
2020 Poster: Extra-gradient with player sampling for faster convergence in n-player games »
Samy Jelassi · Carles Domingo-Enrich · Damien Scieur · Arthur Mensch · Joan Bruna -
2019 Workshop: Theoretical Physics for Deep Learning »
Jaehoon Lee · Jeffrey Pennington · Yasaman Bahri · Max Welling · Surya Ganguli · Joan Bruna -
2019 : Opening Remarks »
Jaehoon Lee · Jeffrey Pennington · Yasaman Bahri · Max Welling · Surya Ganguli · Joan Bruna -
2019 Poster: Approximating Orthogonal Matrices with Effective Givens Factorization »
Thomas Frerix · Joan Bruna -
2019 Oral: Approximating Orthogonal Matrices with Effective Givens Factorization »
Thomas Frerix · Joan Bruna