An Investigation into Neural Net Optimization via Hessian Eigenvalue Density
Behrooz Ghorbani · Shankar Krishnan · Ying Xiao

Tue Jun 11 04:00 PM -- 04:20 PM (PDT) @ Hall B

To understand the dynamics of training in deep neural networks, we study the evolution of the Hessian eigenvalue density throughout the optimization process. In non-batch normalized networks, we observe the rapid appearance of large isolated eigenvalues in the spectrum, along with a surprising concentration of the gradient in the corresponding eigenspaces. In a batch normalized network, these two effects are almost absent. We give a theoretical rationale to partially explain these phenomena. As part of this work, we adapt advanced tools from numerical linear algebra that allow scalable and accurate estimation of the entire Hessian spectrum of ImageNet-scale neural networks; this technique may be of independent interest in other applications.
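The abstract does not name the numerical linear algebra tools it adapts. As an illustration only, the sketch below uses stochastic Lanczos quadrature, a standard way to estimate a spectral density from matrix-vector products alone; treating it as the technique used here is an assumption. The function names (`lanczos`, `spectral_density`), the toy matrix standing in for a Hessian, and the kernel width `sigma` are all hypothetical. In a real network the `matvec` argument would be a Hessian-vector product computed by automatic differentiation rather than an explicit matrix.

```python
import numpy as np


def lanczos(matvec, dim, num_steps, rng):
    """Run num_steps of Lanczos from a random unit vector.

    Returns the Ritz values (approximate eigenvalues) and the quadrature
    weights (squared first components of the tridiagonal eigenvectors).
    Assumes no breakdown, i.e. no off-diagonal coefficient is zero.
    """
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    basis = np.zeros((num_steps, dim))
    alphas = np.zeros(num_steps)
    betas = np.zeros(num_steps - 1)
    for j in range(num_steps):
        basis[j] = v
        w = matvec(v)
        alphas[j] = v @ w
        # Full reorthogonalization against every Lanczos vector so far.
        w = w - basis[: j + 1].T @ (basis[: j + 1] @ w)
        if j < num_steps - 1:
            betas[j] = np.linalg.norm(w)
            v = w / betas[j]
    tridiag = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    ritz_values, ritz_vectors = np.linalg.eigh(tridiag)
    weights = ritz_vectors[0] ** 2  # first component of each eigenvector
    return ritz_values, weights


def spectral_density(ritz_values, weights, grid, sigma):
    """Smooth the weighted Ritz values with Gaussians of width sigma."""
    diffs = grid[:, None] - ritz_values[None, :]
    kernels = np.exp(-0.5 * (diffs / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return kernels @ weights


# Toy stand-in for a Hessian: a bulk of small eigenvalues plus one outlier.
rng = np.random.default_rng(0)
dim = 200
eigenvalues = np.concatenate([rng.uniform(-1.0, 1.0, dim - 1), [25.0]])
q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
toy_hessian = (q * eigenvalues) @ q.T

# Average the estimate over a few random probe vectors.
grid = np.linspace(-5.0, 30.0, 701)
density = np.zeros_like(grid)
num_probes = 5
for _ in range(num_probes):
    vals, wts = lanczos(lambda x: toy_hessian @ x, dim, 40, rng)
    density += spectral_density(vals, wts, grid, sigma=0.2) / num_probes
```

Only matrix-vector products with the Hessian are needed, which is why this family of methods can scale to large models. The isolated eigenvalue at 25 in the toy matrix mimics the kind of outlier the abstract describes for networks without batch normalization.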

Author Information

Behrooz Ghorbani (Stanford University)
Shankar Krishnan (Google)
Ying Xiao (Google Inc)
