

Poster in Workshop: HiLD: High-dimensional Learning Dynamics Workshop

Hessian Inertia in Neural Networks

Xuchan Bao · Alberto Bietti · Aaron Defazio · Vivien Cabannes


Abstract:

The Hessian matrix of a neural network provides important insight into its training dynamics. While most prior work has focused on the eigenvalues of the Hessian, we track its top eigenvectors throughout training. We uncover a surprising phenomenon, which we term "Hessian inertia", in which the eigenvectors of the Hessian tend not to move much during training. We hypothesize that Hessian inertia is related to feature learning, and offer insight through a 2D example.
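
To make "keeping track of the top eigenvectors" concrete, here is a minimal sketch of one way such a measurement could be set up. It is not the authors' experiment or their 2D example: the two-parameter model a*tanh(w*x), the toy data, the learning rate, and the helper names (`loss`, `grad`, `hessian`, `top_eigenvector`, `track_overlap`) are all assumptions chosen for illustration. The sketch runs gradient descent and, at regular intervals, records the absolute cosine between the current top Hessian eigenvector and the one at initialization. An overlap that stays close to 1 would correspond to an eigenvector that barely moves.

```python
import math

# Hypothetical toy data (an assumption, not from the abstract):
# targets come from a "teacher" a*tanh(w*x) with (w, a) = (1.5, 2.0).
XS = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
YS = [2.0 * math.tanh(1.5 * x) for x in XS]


def loss(w, a):
    """Half mean squared error of the model a*tanh(w*x)."""
    total = sum((a * math.tanh(w * x) - y) ** 2 for x, y in zip(XS, YS))
    return 0.5 * total / len(XS)


def grad(w, a):
    """Analytic gradient of the loss with respect to (w, a)."""
    gw = ga = 0.0
    for x, y in zip(XS, YS):
        t = math.tanh(w * x)
        r = a * t - y                      # residual
        gw += r * a * x * (1.0 - t * t)    # d/dw, using tanh' = 1 - tanh^2
        ga += r * t                        # d/da
    n = len(XS)
    return gw / n, ga / n


def hessian(w, a, eps=1e-5):
    """2x2 Hessian entries from central differences of the gradient."""
    gw_wp, ga_wp = grad(w + eps, a)
    gw_wm, ga_wm = grad(w - eps, a)
    gw_ap, ga_ap = grad(w, a + eps)
    gw_am, ga_am = grad(w, a - eps)
    h_ww = (gw_wp - gw_wm) / (2 * eps)
    h_aa = (ga_ap - ga_am) / (2 * eps)
    # Average the two mixed-derivative estimates so the matrix is symmetric.
    h_wa = 0.5 * ((ga_wp - ga_wm) + (gw_ap - gw_am)) / (2 * eps)
    return h_ww, h_wa, h_aa


def top_eigenvector(h_ww, h_wa, h_aa):
    """Unit eigenvector of the largest eigenvalue of a symmetric 2x2 matrix."""
    theta = 0.5 * math.atan2(2.0 * h_wa, h_ww - h_aa)
    return math.cos(theta), math.sin(theta)


def track_overlap(steps=200, lr=0.1, w=0.5, a=0.5, every=20):
    """Run gradient descent; log (step, loss, |cos| to the initial top eigenvector)."""
    v0 = top_eigenvector(*hessian(w, a))
    history = []
    for step in range(steps + 1):
        if step % every == 0:
            v = top_eigenvector(*hessian(w, a))
            # Absolute value: an eigenvector is only defined up to sign.
            overlap = abs(v[0] * v0[0] + v[1] * v0[1])
            history.append((step, loss(w, a), overlap))
        gw, ga = grad(w, a)
        w, a = w - lr * gw, a - lr * ga
    return history


if __name__ == "__main__":
    for step, value, overlap in track_overlap():
        print(f"step {step:4d}  loss {value:.6f}  overlap {overlap:.4f}")
```

With only two parameters the Hessian is small enough to form explicitly. For a real network, the same overlap would more plausibly be estimated with Hessian-vector products and an iterative eigensolver, which this sketch does not attempt.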
