Poster in Workshop: HiLD: High-dimensional Learning Dynamics Workshop
Hessian Inertia in Neural Networks
Xuchan Bao · Alberto Bietti · Aaron Defazio · Vivien Cabannes
Abstract:
The Hessian matrix of a neural network provides important insight into its training dynamics. While most prior work has focused on the eigenvalues of the Hessian, we track its top eigenvectors throughout training. We uncover a surprising phenomenon, which we term "Hessian inertia", where the eigenvectors of the Hessian tend not to move much during training. We hypothesize that Hessian inertia is related to feature learning, and offer insights through a 2D example.
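As a rough illustration of what tracking a top Hessian eigenvector can look like, the sketch below runs gradient descent on a toy two-parameter loss L(a, b) = 0.5 (ab - 1)^2 and records the overlap between the current top eigenvector and the one at initialization. The loss, the function names, and the hyperparameters are assumptions chosen for illustration; they are not taken from the poster and need not match the authors' 2D example.

```python
import math

def loss(a, b):
    # Hypothetical toy loss: a two-parameter "network" fitting the target 1.
    return 0.5 * (a * b - 1.0) ** 2

def grad(a, b):
    r = a * b - 1.0
    return r * b, r * a

def hessian(a, b):
    # Analytic 2x2 symmetric Hessian [[p, q], [q, s]] of the toy loss.
    p = b * b
    s = a * a
    q = 2.0 * a * b - 1.0
    return p, q, s

def top_eigenvector(p, q, s):
    # For a symmetric 2x2 matrix, the unit eigenvector of the largest
    # eigenvalue lies at angle 0.5 * atan2(2q, p - s).
    theta = 0.5 * math.atan2(2.0 * q, p - s)
    return math.cos(theta), math.sin(theta)

def track_overlap(a, b, lr=0.1, steps=50):
    # Returns per-step losses and |<v_t, v_0>|, where v_t is the top
    # Hessian eigenvector at step t. The absolute value removes the
    # sign ambiguity of eigenvectors.
    v0 = top_eigenvector(*hessian(a, b))
    losses, overlaps = [], []
    for _ in range(steps + 1):
        vt = top_eigenvector(*hessian(a, b))
        losses.append(loss(a, b))
        overlaps.append(abs(v0[0] * vt[0] + v0[1] * vt[1]))
        ga, gb = grad(a, b)
        a, b = a - lr * ga, b - lr * gb
    return losses, overlaps

losses, overlaps = track_overlap(a=0.5, b=1.5)
```

An overlap that stays close to 1 over the run would correspond to the eigenvector moving little, which is the kind of behavior the abstract describes. For real networks the Hessian is too large to form explicitly, so one would typically rely on Hessian-vector products with an iterative eigensolver instead of the closed-form 2x2 formula used here.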