Unifying Low-Dimensional Spectra in Deep Learning
Abstract
Low-dimensional structures appear ubiquitously in the eigenspectra of the matrices that arise in deep classification networks trained in the overparameterized regime. While theoretical advances have aimed to explain this phenomenology, they typically capture only subsets of the full behavior or rely on assumptions that cannot hold in practice. In this work, we provide an analytic explanation for the bulk–outlier structure of several canonical deep-learning matrices, including the Hessian, the gradients, and the weights. We achieve this using unconstrained feature models (UFMs), a now-common tool for studying the emergence of deep neural collapse (DNC). We show that DNC is the source of these low-dimensional eigenspectra: in each case, the eigenvalues and eigenvectors can be constructed from the feature means, the objects that characterize DNC. This yields a unifying analytic explanation for a wide range of spectral phenomena in deep learning and goes beyond empirical characterizations, which typically focus on eigenvalues, by analyzing the eigenvectors in detail. We prove that our results hold for both linear and ReLU networks, and we validate them numerically both in the modeling context and in standard deep-network architectures on canonical datasets.
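To make the bulk–outlier picture concrete, the sketch below builds a toy unconstrained-feature setup, not the paper's actual experiments: features concentrate around C class means arranged as a simplex equiangular tight frame (the geometry class means approach under neural collapse), plus small within-class noise. The Gram matrix of such features exhibits C − 1 outlier eigenvalues above a small bulk, and the outlier eigenvectors lie in the span of the class means. All sizes and parameter values (C, d, n_per_class, noise) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
C, d, n_per_class, noise = 4, 64, 200, 0.05   # illustrative sizes, not from the paper

# Class means arranged as a simplex equiangular tight frame (ETF).
M = np.eye(C) - np.ones((C, C)) / C               # centering projection, rank C-1
Q, _ = np.linalg.qr(rng.standard_normal((d, C)))  # orthonormal embedding into R^d
means = Q @ M                                     # (d, C) centered class means
means /= np.linalg.norm(means, axis=0)            # unit-norm simplex-ETF columns

# Unconstrained features: each sample is its class mean plus small noise.
labels = np.repeat(np.arange(C), n_per_class)
H = means[:, labels] + noise * rng.standard_normal((d, C * n_per_class))

# Second-moment (Gram) matrix of the features and its spectrum.
G = H @ H.T / H.shape[1]
eigvals, eigvecs = np.linalg.eigh(G)              # ascending order
print("top 6 eigenvalues:", np.round(eigvals[::-1][:6], 4))
# Expect C-1 = 3 outliers near 1/(C-1), separated from a bulk of order noise**2.

# Outlier eigenvectors should align with the span of the centered class means.
span_basis, _ = np.linalg.qr(means[:, : C - 1])   # any C-1 means span the ETF plane
top = eigvecs[:, ::-1][:, : C - 1]
s = np.linalg.svd(span_basis.T @ top, compute_uv=False)
print("subspace alignment (singular values; 1 = aligned):", np.round(s, 3))
```

With these settings, the C − 1 = 3 outlier eigenvalues sit near 1/(C − 1) ≈ 0.33 while the bulk sits near noise² = 0.0025, and the singular values print close to 1: the outlier eigenvectors are built from the class means, in the spirit of the abstract's claim that eigenvalues and eigenvectors can be constructed from the feature means.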