Abstract
We show that the input correlation matrix of typical classification datasets has an eigenspectrum where, after a sharp initial drop, a large number of small eigenvalues are distributed uniformly over an exponentially large range. This structure is mirrored in a network trained on this data: we show that the Hessian and the Fisher Information Matrix (FIM) have eigenvalues that are spread uniformly over exponentially large ranges. We call such eigenspectra "sloppy" because sets of weights corresponding to small eigenvalues can be changed by large magnitudes without affecting the loss. Networks trained on atypical datasets with non-sloppy inputs do not share these traits, and deep networks trained on such datasets generalize poorly. Inspired by this, we study the hypothesis that sloppiness of inputs aids generalization in deep networks. We show that if the Hessian is sloppy, we can compute non-vacuous PAC-Bayes generalization bounds analytically. By exploiting our empirical observation that training predominantly takes place in the non-sloppy subspace of the FIM, we develop data-distribution-dependent PAC-Bayes priors that lead to accurate generalization bounds using numerical optimization.
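The first claim, about the eigenspectrum of the input correlation matrix, is straightforward to probe empirically. Below is a minimal sketch (not the authors' code; it assumes NumPy and scikit-learn, and uses the small digits dataset as a stand-in for the classification datasets studied in the paper) that computes the eigenvalues of the input correlation matrix and reports how many orders of magnitude they span. A "sloppy" spectrum shows a sharp initial drop followed by eigenvalues spread roughly uniformly on a log scale over many decades.

```python
import numpy as np
from sklearn.datasets import load_digits

# Flattened inputs of a small classification dataset, one sample per row.
X, _ = load_digits(return_X_y=True)   # X has shape (1797, 64)
X = X - X.mean(axis=0)                # center each feature

# Input correlation (second-moment) matrix and its eigenspectrum.
C = (X.T @ X) / X.shape[0]            # (64, 64)
eigs = np.linalg.eigvalsh(C)[::-1]    # eigenvalues, largest first
eigs = eigs[eigs > 1e-12]             # drop numerically-zero modes

# A sloppy spectrum drops sharply, then spreads its small eigenvalues
# roughly uniformly in log-scale over an exponentially large range.
decades = np.log10(eigs[0] / eigs[-1])
print(f"eigenvalues span ~{decades:.1f} orders of magnitude")
print(np.round(np.log10(eigs), 1))    # roughly evenly spaced after the initial drop
```

The same recipe applies to the Hessian or FIM of a trained network, with the eigendecomposition replaced by an iterative method when the matrix is too large to form explicitly.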
Author Information
Rubing Yang (University of Pennsylvania)
Jialin Mao (University of Pennsylvania)
Pratik Chaudhari (University of Pennsylvania, Amazon Web Services)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Does the Data Induce Capacity Control in Deep Learning?
  Thu. Jul 21 through Fri. Jul 22, Room Hall E #234
More from the Same Authors
- 2021: Continuous Doubly Constrained Batch Reinforcement Learning
  Rasool Fakoor · Jonas Mueller · Kavosh Asadi · Pratik Chaudhari · Alex Smola
- 2023 Poster: The Value of Out-of-Distribution Data
  Ashwin De Silva · Rahul Ramesh · Carey Priebe · Pratik Chaudhari · Joshua Vogelstein
- 2023 Poster: A Picture of the Space of Typical Learnable Tasks
  Rahul Ramesh · Jialin Mao · Itay Griniasty · Rubing Yang · Han Kheng Teoh · Mark Transtrum · James Sethna · Pratik Chaudhari
- 2023 Workshop: New Frontiers in Learning, Control, and Dynamical Systems
  Valentin De Bortoli · Charlotte Bunne · Guan-Horng Liu · Tianrong Chen · Maxim Raginsky · Pratik Chaudhari · Melanie Zeilinger · Animashree Anandkumar
- 2022 Poster: Deep Reference Priors: What is the best way to pretrain a model?
  Yansong Gao · Rahul Ramesh · Pratik Chaudhari
- 2022 Spotlight: Deep Reference Priors: What is the best way to pretrain a model?
  Yansong Gao · Rahul Ramesh · Pratik Chaudhari
- 2021 Poster: An Information-Geometric Distance on the Space of Tasks
  Yansong Gao · Pratik Chaudhari
- 2021 Spotlight: An Information-Geometric Distance on the Space of Tasks
  Yansong Gao · Pratik Chaudhari
- 2020 Poster: A Free-Energy Principle for Representation Learning
  Yansong Gao · Pratik Chaudhari