Hessian Inertia in Neural Networks
Xuchan Bao · Alberto Bietti · Aaron Defazio · Vivien Cabannes

The Hessian matrix of a neural network provides important insight into its training dynamics. While most prior work has focused on the eigenvalues of the Hessian, we track its top eigenvectors throughout training. We uncover a surprising phenomenon, which we term "Hessian inertia": the eigenvectors of the Hessian tend not to move much during training. We hypothesize that Hessian inertia is related to feature learning, and offer insight through a 2D example.
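
As a rough illustration of what tracking the top Hessian eigenvector during training could look like, the following minimal sketch trains a tiny network with gradient descent and records the absolute cosine overlap between the current top eigenvector and the one at initialization. Everything here is an assumption made for illustration and is not taken from the paper: the two-hidden-unit tanh model, the synthetic data, the learning rate, the finite-difference Hessian, and all function names are hypothetical, and the paper's 2D example is not reproduced.

```python
import numpy as np

def loss(theta, X, y):
    # Hypothetical toy model: 2 inputs, 2 tanh hidden units, scalar output.
    W = theta[:4].reshape(2, 2)
    a = theta[4:6]
    pred = np.tanh(X @ W.T) @ a
    return 0.5 * np.mean((pred - y) ** 2)

def numerical_grad(f, theta, eps=1e-5):
    # Central finite-difference gradient of a scalar function f.
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        g[i] = (f(theta + e) - f(theta - e)) / (2 * eps)
    return g

def numerical_hessian(f, theta, eps=1e-4):
    # Finite-difference Hessian, symmetrized to suppress numerical noise.
    n = theta.size
    H = np.zeros((n, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = eps
        H[:, i] = (numerical_grad(f, theta + e) - numerical_grad(f, theta - e)) / (2 * eps)
    return 0.5 * (H + H.T)

def top_eigenvector(H):
    # np.linalg.eigh returns eigenvalues in ascending order, so the last
    # column is the unit eigenvector of the largest (algebraic) eigenvalue.
    _, vecs = np.linalg.eigh(H)
    return vecs[:, -1]

def track_overlap(steps=50, lr=0.1, seed=0):
    # Assumed synthetic regression task; not the authors' setup.
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(64, 2))
    y = np.sin(X[:, 0])
    theta = 0.5 * rng.normal(size=6)
    f = lambda t: loss(t, X, y)
    v0 = top_eigenvector(numerical_hessian(f, theta))
    overlaps = []
    for _ in range(steps):
        theta = theta - lr * numerical_grad(f, theta)
        v = top_eigenvector(numerical_hessian(f, theta))
        # Absolute value removes the sign ambiguity of eigenvectors.
        overlaps.append(abs(v0 @ v))
    return np.array(overlaps)
```

An overlap that stays close to 1 across steps would indicate that the top eigenvector has barely rotated since initialization. For networks of realistic size, forming the full Hessian is impractical, and Hessian-vector products combined with power iteration or Lanczos are the usual substitute.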

Author Information

Xuchan Bao (University of Toronto)
Alberto Bietti (Meta AI)
Aaron Defazio (FAIR - Meta AI)
Vivien Cabannes (Meta AI)
