Timezone: »
The Neural Tangent Kernel (NTK), defined as the outer product of the neural network (NN) Jacobians, has emerged as a central object of study in deep learning. In the infinite width limit, the NTK can sometimes be computed analytically and is useful for understanding training and generalization of NN architectures. At finite widths, the NTK is also used to better initialize NNs, compare the conditioning across models, perform architecture search, and do meta-learning. Unfortunately, the finite width NTK is notoriously expensive to compute, which severely limits its practical utility. We perform the first in-depth analysis of the compute and memory requirements for NTK computation in finite width networks. Leveraging the structure of neural networks, we further propose two novel algorithms that change the exponent of the compute and memory requirements of the finite width NTK, dramatically improving efficiency.Our algorithms can be applied in a black box fashion to any differentiable function, including those implementing neural networks.We open-source our implementations within the Neural Tangents package at https://github.com/google/neural-tangents.
Author Information
Roman Novak (Google Brain)
Jascha Sohl-Dickstein (Google Brain)
Samuel Schoenholz (Google Brain)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Poster: Fast Finite Width Neural Tangent Kernel »
Thu. Jul 21st through Fri the 22nd Room Hall E #328
More from the Same Authors
-
2022 Poster: Deep equilibrium networks are sensitive to initialization statistics »
Atish Agarwala · Samuel Schoenholz -
2022 Poster: Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling »
Jiri Hron · Roman Novak · Jeffrey Pennington · Jascha Sohl-Dickstein -
2022 Spotlight: Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling »
Jiri Hron · Roman Novak · Jeffrey Pennington · Jascha Sohl-Dickstein -
2022 Spotlight: Deep equilibrium networks are sensitive to initialization statistics »
Atish Agarwala · Samuel Schoenholz -
2021 Poster: Learn2Hop: Learned Optimization on Rough Landscapes »
Amil Merchant · Luke Metz · Samuel Schoenholz · Ekin Dogus Cubuk -
2021 Spotlight: Learn2Hop: Learned Optimization on Rough Landscapes »
Amil Merchant · Luke Metz · Samuel Schoenholz · Ekin Dogus Cubuk -
2021 Poster: Tilting the playing field: Dynamical loss functions for machine learning »
Miguel Ruiz Garcia · Ge Zhang · Samuel Schoenholz · Andrea Liu -
2021 Oral: Tilting the playing field: Dynamical loss functions for machine learning »
Miguel Ruiz Garcia · Ge Zhang · Samuel Schoenholz · Andrea Liu -
2021 Poster: Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization »
Neha Wadia · Daniel Duckworth · Samuel Schoenholz · Ethan Dyer · Jascha Sohl-Dickstein -
2021 Poster: Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies »
Paul Vicol · Luke Metz · Jascha Sohl-Dickstein -
2021 Spotlight: Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization »
Neha Wadia · Daniel Duckworth · Samuel Schoenholz · Ethan Dyer · Jascha Sohl-Dickstein -
2021 Oral: Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies »
Paul Vicol · Luke Metz · Jascha Sohl-Dickstein -
2020 Poster: Infinite attention: NNGP and NTK for deep attention networks »
Jiri Hron · Yasaman Bahri · Jascha Sohl-Dickstein · Roman Novak -
2020 Poster: Disentangling Trainability and Generalization in Deep Neural Networks »
Lechao Xiao · Jeffrey Pennington · Samuel Schoenholz -
2019 : Poster discussion »
Roman Novak · Maxime Gabella · Frederic Dreyer · Siavash Golkar · Anh Tong · Irina Higgins · Mirco Milletari · Joe Antognini · Sebastian Goldt · Adín Ramírez Rivera · Roberto Bondesan · Ryo Karakida · Remi Tachet des Combes · Michael Mahoney · Nicholas Walker · Stanislav Fort · Samuel Smith · Rohan Ghosh · Aristide Baratin · Diego Granziol · Stephen Roberts · Dmitry Vetrov · Andrew Wilson · César Laurent · Valentin Thomas · Simon Lacoste-Julien · Dar Gilboa · Daniel Soudry · Anupam Gupta · Anirudh Goyal · Yoshua Bengio · Erich Elsen · Soham De · Stanislaw Jastrzebski · Charles H Martin · Samira Shabanian · Aaron Courville · Shorato Akaho · Lenka Zdeborova · Ethan Dyer · Maurice Weiler · Pim de Haan · Taco Cohen · Max Welling · Ping Luo · zhanglin peng · Nasim Rahaman · Loic Matthey · Danilo J. Rezende · Jaesik Choi · Kyle Cranmer · Lechao Xiao · Jaehoon Lee · Yasaman Bahri · Jeffrey Pennington · Greg Yang · Jiri Hron · Jascha Sohl-Dickstein · Guy Gur-Ari -
2019 : Understanding overparameterized neural networks »
Jascha Sohl-Dickstein -
2019 : Poster spotlights »
Roman Novak · Frederic Dreyer · Siavash Golkar · Irina Higgins · Joe Antognini · Ryo Karakida · Rohan Ghosh -
2019 Poster: Understanding and correcting pathologies in the training of learned optimizers »
Luke Metz · Niru Maheswaranathan · Jeremy Nixon · Daniel Freeman · Jascha Sohl-Dickstein -
2019 Poster: Guided evolutionary strategies: augmenting random search with surrogate gradients »
Niru Maheswaranathan · Luke Metz · George Tucker · Dami Choi · Jascha Sohl-Dickstein -
2019 Oral: Guided evolutionary strategies: augmenting random search with surrogate gradients »
Niru Maheswaranathan · Luke Metz · George Tucker · Dami Choi · Jascha Sohl-Dickstein -
2019 Oral: Understanding and correcting pathologies in the training of learned optimizers »
Luke Metz · Niru Maheswaranathan · Jeremy Nixon · Daniel Freeman · Jascha Sohl-Dickstein -
2019 Poster: The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study »
Daniel Park · Jascha Sohl-Dickstein · Quoc Le · Samuel Smith -
2019 Oral: The Effect of Network Width on Stochastic Gradient Descent and Generalization: an Empirical Study »
Daniel Park · Jascha Sohl-Dickstein · Quoc Le · Samuel Smith -
2018 Poster: Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks »
Minmin Chen · Jeffrey Pennington · Samuel Schoenholz -
2018 Oral: Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks »
Minmin Chen · Jeffrey Pennington · Samuel Schoenholz -
2018 Poster: Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks »
Lechao Xiao · Yasaman Bahri · Jascha Sohl-Dickstein · Samuel Schoenholz · Jeffrey Pennington -
2018 Oral: Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks »
Lechao Xiao · Yasaman Bahri · Jascha Sohl-Dickstein · Samuel Schoenholz · Jeffrey Pennington -
2017 Poster: Neural Message Passing for Quantum Chemistry »
Justin Gilmer · Samuel Schoenholz · Patrick F Riley · Oriol Vinyals · George Dahl -
2017 Talk: Neural Message Passing for Quantum Chemistry »
Justin Gilmer · Samuel Schoenholz · Patrick F Riley · Oriol Vinyals · George Dahl -
2017 Poster: Input Switched Affine Networks: An RNN Architecture Designed for Interpretability »
Jakob Foerster · Justin Gilmer · Jan Chorowski · Jascha Sohl-Dickstein · David Sussillo -
2017 Poster: Learned Optimizers that Scale and Generalize »
Olga Wichrowska · Niru Maheswaranathan · Matthew Hoffman · Sergio Gómez Colmenarejo · Misha Denil · Nando de Freitas · Jascha Sohl-Dickstein -
2017 Talk: Input Switched Affine Networks: An RNN Architecture Designed for Interpretability »
Jakob Foerster · Justin Gilmer · Jan Chorowski · Jascha Sohl-Dickstein · David Sussillo -
2017 Poster: On the Expressive Power of Deep Neural Networks »
Maithra Raghu · Ben Poole · Surya Ganguli · Jon Kleinberg · Jascha Sohl-Dickstein -
2017 Talk: Learned Optimizers that Scale and Generalize »
Olga Wichrowska · Niru Maheswaranathan · Matthew Hoffman · Sergio Gómez Colmenarejo · Misha Denil · Nando de Freitas · Jascha Sohl-Dickstein -
2017 Talk: On the Expressive Power of Deep Neural Networks »
Maithra Raghu · Ben Poole · Surya Ganguli · Jon Kleinberg · Jascha Sohl-Dickstein