Talk in Workshop: Beyond first order methods in machine learning systems
Spotlight talk 3 - PyHessian: Neural Networks Through the Lens of the Hessian
Amir Gholaminejad
We present PyHessian, a new scalable framework that enables fast computation of Hessian (i.e., second-order derivative) information for deep neural networks. It computes the top Hessian eigenvalues, the Hessian trace, and the full Hessian eigenvalue/spectral density. It supports distributed-memory execution on cloud/supercomputer systems and is available as open source. We show that this framework can be used to analyze neural network models, including the topology of the loss landscape (i.e., curvature information), to gain insight into the behavior of different models/optimizers. In particular, we analyze the effect of Batch Normalization layers on the trainability of NNs. We find that, contrary to common belief, Batch Normalization does not necessarily make the loss landscape smoother, especially for shallow networks.
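To make the quantities named in the abstract concrete, the sketch below estimates a top Hessian eigenvalue and a Hessian trace without ever forming the Hessian, using only Hessian-vector products. It is an illustration under stated assumptions, not PyHessian's API: the toy quadratic loss, the function names (`toy_grad`, `hvp`, `top_eigenvalue`, `hutchinson_trace`), and the finite-difference Hessian-vector product (standing in for the automatic differentiation a deep learning framework would provide) are all hypothetical choices made for this example.

```python
import math
import random


def toy_grad(w):
    """Gradient of the toy loss f(w) = 1.5*w0^2 + 0.5*w1^2 + w0*w1.

    Its Hessian is the constant matrix [[3, 1], [1, 1]] (an assumption
    of this example, chosen so the results can be checked by hand).
    """
    return [3.0 * w[0] + w[1], w[0] + w[1]]


def hvp(grad_fn, w, v, eps=1e-3):
    """Approximate the Hessian-vector product H v by central differences
    of the gradient: (g(w + eps*v) - g(w - eps*v)) / (2*eps)."""
    g_plus = grad_fn([wi + eps * vi for wi, vi in zip(w, v)])
    g_minus = grad_fn([wi - eps * vi for wi, vi in zip(w, v)])
    return [(a - b) / (2.0 * eps) for a, b in zip(g_plus, g_minus)]


def top_eigenvalue(grad_fn, w, iters=100, seed=0):
    """Power iteration on H, using only Hessian-vector products.

    Converges to the eigenvalue of largest magnitude.
    """
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in w]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    eig = 0.0
    for _ in range(iters):
        hv = hvp(grad_fn, w, v)
        eig = sum(a * b for a, b in zip(v, hv))  # Rayleigh quotient v^T H v
        norm = math.sqrt(sum(x * x for x in hv))
        v = [x / norm for x in hv]
    return eig


def hutchinson_trace(grad_fn, w, samples=2000, seed=0):
    """Hutchinson estimator: the average of v^T H v over random
    Rademacher vectors v (entries +1 or -1) estimates trace(H)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        v = [rng.choice((-1.0, 1.0)) for _ in w]
        hv = hvp(grad_fn, w, v)
        total += sum(a * b for a, b in zip(v, hv))
    return total / samples


if __name__ == "__main__":
    w0 = [0.5, -0.2]  # arbitrary evaluation point
    # Analytic values for this toy Hessian: top eigenvalue 2 + sqrt(2), trace 4.
    print("top eigenvalue estimate:", top_eigenvalue(toy_grad, w0))
    print("trace estimate:", hutchinson_trace(toy_grad, w0))
```

Both routines touch the Hessian only through products with a vector, so the cost per step is comparable to a few gradient evaluations; this is what makes such estimates feasible for models with millions of parameters. The trace estimate is stochastic, so its accuracy depends on the number of samples.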