Poster
Deep equilibrium networks are sensitive to initialization statistics
Atish Agarwala · Samuel Schoenholz
Deep equilibrium networks (DEQs) are a promising way to construct models that trade off memory for compute. However, theoretical understanding of these models still lags behind that of traditional networks, in part because of the repeated application of a single set of weights. We show that DEQs are sensitive to the higher-order statistics of the matrix families from which they are initialized. In particular, initializing with orthogonal or symmetric matrices allows for greater stability in training. This gives us a practical prescription for initializations that allow training with a broader range of initial weight scales.
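As a rough illustration of the sensitivity described above, the sketch below iterates a DEQ-style fixed-point equation from a Gaussian and an orthogonal initialization at the same weight scale. This is a minimal sketch, not the paper's exact setup: the tanh update, the damping factor, the iteration count, and the weight scale sigma are all illustrative assumptions.

```python
import numpy as np

def deq_fixed_point(W, x, n_iter=200, damping=0.5):
    """Damped fixed-point iteration z <- (1-a)*z + a*tanh(W @ z + x)."""
    z = np.zeros_like(x)
    for _ in range(n_iter):
        z = (1 - damping) * z + damping * np.tanh(W @ z + x)
    return z

rng = np.random.default_rng(0)
n, sigma = 512, 1.5  # weight scale chosen above the naive stability threshold

# Gaussian initialization: i.i.d. entries with standard deviation sigma/sqrt(n)
W_gauss = rng.normal(0.0, sigma / np.sqrt(n), size=(n, n))

# Orthogonal initialization at the same scale: sigma * Q, Q from a QR decomposition
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
W_orth = sigma * Q

x = rng.normal(size=n) / np.sqrt(n)
for name, W in [("gaussian", W_gauss), ("orthogonal", W_orth)]:
    z = deq_fixed_point(W, x)
    residual = np.linalg.norm(np.tanh(W @ z + x) - z)
    print(f"{name:>10}: fixed-point residual = {residual:.3e}")
```

One intuition the sketch points at: the i.i.d. Gaussian matrix has singular values spread up to roughly 2*sigma, whereas sigma * Q has all singular values exactly equal to sigma, so the orthogonal family stays stable at weight scales where the Gaussian family's largest singular values already exceed the fixed point's contraction regime.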
Author Information
Atish Agarwala (Google Research)
Samuel Schoenholz (Google Brain)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Spotlight: Deep equilibrium networks are sensitive to initialization statistics
  Wed. Jul 20th, 02:40 -- 02:45 PM, Room Ballroom 1 & 2
More from the Same Authors
- 2023 Poster: SAM operates far from home: eigenvalue regularization as a dynamical phenomenon
  Atish Agarwala · Yann Nicolas Dauphin
- 2023 Poster: Second-order regression models exhibit progressive sharpening to the edge of stability
  Atish Agarwala · Fabian Pedregosa · Jeffrey Pennington
- 2022 Poster: Fast Finite Width Neural Tangent Kernel
  Roman Novak · Jascha Sohl-Dickstein · Samuel Schoenholz
- 2022 Spotlight: Fast Finite Width Neural Tangent Kernel
  Roman Novak · Jascha Sohl-Dickstein · Samuel Schoenholz
- 2021 Poster: Learn2Hop: Learned Optimization on Rough Landscapes
  Amil Merchant · Luke Metz · Samuel Schoenholz · Ekin Dogus Cubuk
- 2021 Spotlight: Learn2Hop: Learned Optimization on Rough Landscapes
  Amil Merchant · Luke Metz · Samuel Schoenholz · Ekin Dogus Cubuk
- 2021 Poster: Tilting the playing field: Dynamical loss functions for machine learning
  Miguel Ruiz Garcia · Ge Zhang · Samuel Schoenholz · Andrea Liu
- 2021 Oral: Tilting the playing field: Dynamical loss functions for machine learning
  Miguel Ruiz Garcia · Ge Zhang · Samuel Schoenholz · Andrea Liu
- 2021 Poster: Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization
  Neha Wadia · Daniel Duckworth · Samuel Schoenholz · Ethan Dyer · Jascha Sohl-Dickstein
- 2021 Spotlight: Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization
  Neha Wadia · Daniel Duckworth · Samuel Schoenholz · Ethan Dyer · Jascha Sohl-Dickstein
- 2020 Poster: Disentangling Trainability and Generalization in Deep Neural Networks
  Lechao Xiao · Jeffrey Pennington · Samuel Schoenholz
- 2018 Poster: Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks
  Minmin Chen · Jeffrey Pennington · Samuel Schoenholz
- 2018 Oral: Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks
  Minmin Chen · Jeffrey Pennington · Samuel Schoenholz
- 2018 Poster: Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
  Lechao Xiao · Yasaman Bahri · Jascha Sohl-Dickstein · Samuel Schoenholz · Jeffrey Pennington
- 2018 Oral: Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
  Lechao Xiao · Yasaman Bahri · Jascha Sohl-Dickstein · Samuel Schoenholz · Jeffrey Pennington
- 2017 Poster: Neural Message Passing for Quantum Chemistry
  Justin Gilmer · Samuel Schoenholz · Patrick F Riley · Oriol Vinyals · George Dahl
- 2017 Talk: Neural Message Passing for Quantum Chemistry
  Justin Gilmer · Samuel Schoenholz · Patrick F Riley · Oriol Vinyals · George Dahl