Poster
Geometry of Neural Network Loss Surfaces via Random Matrix Theory
Jeffrey Pennington · Yasaman Bahri
Understanding the geometry of neural network loss surfaces is important for the development of improved optimization algorithms and for building a theoretical understanding of why deep learning works. In this paper, we study the geometry in terms of the distribution of eigenvalues of the Hessian matrix at critical points of varying energy. We introduce an analytical framework and a set of tools from random matrix theory that allow us to compute an approximation of this distribution under a set of simplifying assumptions. The shape of the spectrum depends strongly on the energy and another key parameter, $\phi$, which measures the ratio of parameters to data points. Our analysis predicts and numerical simulations support that for critical points of small index, the number of negative eigenvalues scales like the 3/2 power of the energy. We leave as an open problem an explanation for our observation that, in the context of a certain memorization task, the energy of minimizers is well-approximated by the function $\frac{1}{2}(1-\phi)^2$.
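As a small illustration of the empirical formula quoted at the end of the abstract, the sketch below evaluates $\frac{1}{2}(1-\phi)^2$ for a given ratio $\phi$. The function names are hypothetical, and treating $\phi$ as a plain quotient of parameter count to data-point count is an assumption based only on the abstract's wording; the paper's exact definition and the range of $\phi$ over which the approximation was observed may differ.

```python
def parameter_ratio(num_parameters: int, num_data_points: int) -> float:
    """Ratio of parameters to data points (assumed reading of phi)."""
    if num_data_points <= 0:
        raise ValueError("num_data_points must be positive")
    return num_parameters / num_data_points


def approx_minimizer_energy(phi: float) -> float:
    """Evaluate the abstract's empirical approximation 1/2 * (1 - phi)^2.

    This only evaluates the quoted formula; it does not train a network
    or verify the approximation.
    """
    return 0.5 * (1.0 - phi) ** 2


if __name__ == "__main__":
    phi = parameter_ratio(num_parameters=50, num_data_points=100)
    print(phi, approx_minimizer_energy(phi))  # 0.5 0.125
```

Under this reading, the approximate energy falls from 1/2 at $\phi = 0$ as the number of parameters grows relative to the number of data points.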
Author Information
Jeffrey Pennington (Google Brain)
Yasaman Bahri (Google Brain)
Related Events (a corresponding poster, oral, or spotlight)

2017 Talk: Geometry of Neural Network Loss Surfaces via Random Matrix Theory »
Mon Aug 7th 01:24 - 01:42 AM Room C4.8
More from the Same Authors

2020 Poster: The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization »
Ben Adlam · Jeffrey Pennington 
2020 Poster: Infinite attention: NNGP and NTK for deep attention networks »
Jiri Hron · Yasaman Bahri · Jascha Sohl-Dickstein · Roman Novak
2020 Poster: Disentangling Trainability and Generalization in Deep Neural Networks »
Lechao Xiao · Jeffrey Pennington · Samuel Schoenholz 
2019 Workshop: Theoretical Physics for Deep Learning »
Jaehoon Lee · Jeffrey Pennington · Yasaman Bahri · Max Welling · Surya Ganguli · Joan Bruna 
2018 Poster: Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks »
Minmin Chen · Jeffrey Pennington · Samuel Schoenholz 
2018 Oral: Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks »
Minmin Chen · Jeffrey Pennington · Samuel Schoenholz 
2018 Poster: Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks »
Lechao Xiao · Yasaman Bahri · Jascha Sohl-Dickstein · Samuel Schoenholz · Jeffrey Pennington
2018 Oral: Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks »
Lechao Xiao · Yasaman Bahri · Jascha Sohl-Dickstein · Samuel Schoenholz · Jeffrey Pennington