Implicit Acceleration and Feature Learning in Infinitely Wide Neural Networks with Bottlenecks
Etai Littwin · Omid Saremi · Shuangfei Zhai · Vimal Thilak · Hanlin Goh · Joshua M Susskind · Greg Yang
We analyze the learning dynamics of infinitely wide neural networks with a finite-sized bottleneck. Unlike the neural tangent kernel limit, a bottleneck in an otherwise infinite-width network allows data-dependent feature learning in its bottleneck representation. We empirically show that a single bottleneck in an infinite network dramatically accelerates training compared to a purely infinite network and improves overall performance. We discuss this acceleration phenomenon by drawing similarities to infinitely wide deep linear models, where the acceleration effect of a bottleneck can be understood theoretically.
Author Information
Etai Littwin (Apple)
Omid Saremi (Apple Inc.)
Shuangfei Zhai (Apple)
Vimal Thilak (Apple)
Hanlin Goh (Apple)
Joshua M Susskind (Apple, Inc.)
Greg Yang (Microsoft Research)
More from the Same Authors
-
2021 : Implicit Greedy Rank Learning in Autoencoders via Overparameterized Linear Networks »
Shih-Yu Sun · Vimal Thilak · Etai Littwin · Omid Saremi · Joshua M Susskind -
2021 : Feature Learning in Infinite-Width Neural Networks »
Greg Yang · Edward Hu -
2022 Poster: Efficient Representation Learning via Adaptive Context Pooling »
Chen Huang · Walter Talbott · Navdeep Jaitly · Joshua M Susskind -
2022 Spotlight: Efficient Representation Learning via Adaptive Context Pooling »
Chen Huang · Walter Talbott · Navdeep Jaitly · Joshua M Susskind -
2022 Poster: Position Prediction as an Effective Pretraining Strategy »
Shuangfei Zhai · Navdeep Jaitly · Jason Ramapuram · Dan Busbridge · Tatiana Likhomanenko · Joseph Cheng · Walter Talbott · Chen Huang · Hanlin Goh · Joshua M Susskind -
2022 Spotlight: Position Prediction as an Effective Pretraining Strategy »
Shuangfei Zhai · Navdeep Jaitly · Jason Ramapuram · Dan Busbridge · Tatiana Likhomanenko · Joseph Cheng · Walter Talbott · Chen Huang · Hanlin Goh · Joshua M Susskind -
2021 Poster: Tensor Programs IV: Feature Learning in Infinite-Width Neural Networks »
Greg Yang · Edward Hu -
2021 Poster: Tensor Programs IIb: Architectural Universality Of Neural Tangent Kernel Training Dynamics »
Greg Yang · Etai Littwin -
2021 Spotlight: Tensor Programs IIb: Architectural Universality Of Neural Tangent Kernel Training Dynamics »
Greg Yang · Etai Littwin -
2021 Spotlight: Tensor Programs IV: Feature Learning in Infinite-Width Neural Networks »
Greg Yang · Edward Hu -
2021 Poster: Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning »
Yue Wu · Shuangfei Zhai · Nitish Srivastava · Joshua M Susskind · Jian Zhang · Ruslan Salakhutdinov · Hanlin Goh -
2021 Spotlight: Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning »
Yue Wu · Shuangfei Zhai · Nitish Srivastava · Joshua M Susskind · Jian Zhang · Ruslan Salakhutdinov · Hanlin Goh -
2020 Poster: Equivariant Neural Rendering »
Emilien Dupont · Miguel Angel Bautista Martin · Alex Colburn · Aditya Sankar · Joshua M Susskind · Qi Shan -
2020 Poster: Randomized Smoothing of All Shapes and Sizes »
Greg Yang · Tony Duan · J. Edward Hu · Hadi Salman · Ilya Razenshteyn · Jerry Li -
2019 : Poster discussion »
Roman Novak · Maxime Gabella · Frederic Dreyer · Siavash Golkar · Anh Tong · Irina Higgins · Mirco Milletari · Joe Antognini · Sebastian Goldt · Adín Ramírez Rivera · Roberto Bondesan · Ryo Karakida · Remi Tachet des Combes · Michael Mahoney · Nicholas Walker · Stanislav Fort · Samuel Smith · Rohan Ghosh · Aristide Baratin · Diego Granziol · Stephen Roberts · Dmitry Vetrov · Andrew Wilson · César Laurent · Valentin Thomas · Simon Lacoste-Julien · Dar Gilboa · Daniel Soudry · Anupam Gupta · Anirudh Goyal · Yoshua Bengio · Erich Elsen · Soham De · Stanislaw Jastrzebski · Charles H Martin · Samira Shabanian · Aaron Courville · Shorato Akaho · Lenka Zdeborova · Ethan Dyer · Maurice Weiler · Pim de Haan · Taco Cohen · Max Welling · Ping Luo · zhanglin peng · Nasim Rahaman · Loic Matthey · Danilo J. Rezende · Jaesik Choi · Kyle Cranmer · Lechao Xiao · Jaehoon Lee · Yasaman Bahri · Jeffrey Pennington · Greg Yang · Jiri Hron · Jascha Sohl-Dickstein · Guy Gur-Ari -
2019 Poster: Addressing the Loss-Metric Mismatch with Adaptive Loss Alignment »
Chen Huang · Shuangfei Zhai · Walter Talbott · Miguel Angel Bautista Martin · Shih-Yu Sun · Carlos Guestrin · Joshua M Susskind -
2019 Oral: Addressing the Loss-Metric Mismatch with Adaptive Loss Alignment »
Chen Huang · Shuangfei Zhai · Walter Talbott · Miguel Angel Bautista Martin · Shih-Yu Sun · Carlos Guestrin · Joshua M Susskind