To understand the dynamics of training in deep neural networks, we study the evolution of the Hessian eigenvalue density throughout the optimization process. In networks without batch normalization, we observe the rapid appearance of large isolated eigenvalues in the spectrum, along with a surprising concentration of the gradient in the corresponding eigenspaces. In a batch-normalized network, these two effects are almost absent. We give a theoretical rationale to partially explain these phenomena. As part of this work, we adapt advanced tools from numerical linear algebra that allow scalable and accurate estimation of the entire Hessian spectrum of ImageNet-scale neural networks; this technique may be of independent interest in other applications.
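The abstract does not name the numerical linear algebra tools it adapts. One common technique for estimating a full eigenvalue density from matrix-vector products alone is stochastic Lanczos quadrature, and the sketch below assumes that approach purely for illustration. The function names (`lanczos`, `slq_density`), the Gaussian smoothing width `sigma`, and the small explicit test matrix are all hypothetical; for a real network, `matvec` would be a Hessian-vector product computed by automatic differentiation rather than a dense matrix multiply.

```python
import numpy as np

def lanczos(matvec, dim, num_steps, rng):
    """Build the Lanczos tridiagonal matrix from a random unit start vector."""
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    basis = np.zeros((num_steps, dim))
    alphas, betas = [], []
    for j in range(num_steps):
        basis[j] = v
        w = matvec(v)
        alphas.append(v @ w)
        # Full reorthogonalization against every Lanczos vector so far.
        w = w - basis[: j + 1].T @ (basis[: j + 1] @ w)
        beta = np.linalg.norm(w)
        if j == num_steps - 1 or beta < 1e-10:
            break  # finished, or an invariant subspace was found
        betas.append(beta)
        v = w / beta
    return np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)

def slq_density(matvec, dim, num_steps, num_probes, grid, sigma, seed=0):
    """Estimate the eigenvalue density on `grid`, averaged over random probes."""
    rng = np.random.default_rng(seed)
    density = np.zeros_like(grid)
    for _ in range(num_probes):
        tridiag = lanczos(matvec, dim, num_steps, rng)
        nodes, vecs = np.linalg.eigh(tridiag)
        weights = vecs[0] ** 2  # quadrature weights; they sum to 1
        # Replace each weighted point mass by a Gaussian bump of width sigma.
        diffs = grid[:, None] - nodes[None, :]
        bumps = np.exp(-0.5 * (diffs / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        density += bumps @ weights
    return density / num_probes

# Toy stand-in for a Hessian: a bulk in [-1, 1] plus one isolated eigenvalue at 10.
dim = 200
rng = np.random.default_rng(1)
q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
eigenvalues = np.concatenate([np.linspace(-1.0, 1.0, dim - 1), [10.0]])
hessian = (q * eigenvalues) @ q.T

grid = np.linspace(-3.0, 13.0, 3201)
density = slq_density(lambda x: hessian @ x, dim, num_steps=40,
                      num_probes=10, grid=grid, sigma=0.1)
dx = grid[1] - grid[0]
total_mass = density.sum() * dx
outlier_mass = density[grid > 9.0].sum() * dx
```

In this toy setup the estimated density integrates to roughly one, and a small bump near 10 corresponds to the isolated eigenvalue, whose true share of the spectrum is 1/200. The number of Lanczos steps, the number of probes, and `sigma` trade accuracy against cost and would need tuning for any real model.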
Author Information
Behrooz Ghorbani (Stanford University)
Shankar Krishnan (Google)
Ying Xiao (Google Inc)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Poster: An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
Wed. Jun 12th 01:30 -- 04:00 AM Room Pacific Ballroom
More from the Same Authors
- 2019 Poster: An Instability in Variational Inference for Topic Models
Behrooz Ghorbani · Hamidreza Hakim Javadi · Andrea Montanari
- 2019 Oral: An Instability in Variational Inference for Topic Models
Behrooz Ghorbani · Hamidreza Hakim Javadi · Andrea Montanari
- 2018 Poster: Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron
RJ Skerry-Ryan · Eric Battenberg · Ying Xiao · Yuxuan Wang · Daisy Stanton · Joel Shor · Ron Weiss · Robert Clark · Rif Saurous
- 2018 Poster: Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Yuxuan Wang · Daisy Stanton · Yu Zhang · RJ Skerry-Ryan · Eric Battenberg · Joel Shor · Ying Xiao · Ye Jia · Fei Ren · Rif Saurous
- 2018 Oral: Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Yuxuan Wang · Daisy Stanton · Yu Zhang · RJ Skerry-Ryan · Eric Battenberg · Joel Shor · Ying Xiao · Ye Jia · Fei Ren · Rif Saurous
- 2018 Oral: Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron
RJ Skerry-Ryan · Eric Battenberg · Ying Xiao · Yuxuan Wang · Daisy Stanton · Joel Shor · Ron Weiss · Robert Clark · Rif Saurous