Continual learning (learning new tasks in sequence while maintaining performance on old tasks) remains particularly challenging for artificial neural networks. Surprisingly, the amount of forgetting does not increase with the dissimilarity between the learned tasks, but appears to be worst in an intermediate similarity regime. In this paper we theoretically analyse both a synthetic teacher-student framework and a real-data setup to provide an explanation for this phenomenon, which we name the Maslow's Hammer hypothesis. Our analysis reveals a trade-off between node activation and node re-use that produces the worst forgetting in the intermediate regime. Using this understanding, we reinterpret popular algorithmic interventions for catastrophic interference in terms of this trade-off, and identify the regimes in which they are most effective.
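The teacher-student setup the abstract refers to can be illustrated with a small numerical experiment. Below is a minimal sketch in plain NumPy; all names (`teacher_pair`, `train`, etc.), the dimensions, and the learning rate are illustrative choices, not the paper's exact configuration. It trains a two-layer ReLU student online on one teacher, then on a second teacher whose first-layer weights overlap with the first teacher's by a controllable similarity, and measures how much the error on the first task grows after the switch, i.e. the forgetting.

```python
import numpy as np

rng = np.random.default_rng(0)

D, M, K = 50, 2, 2          # input dim, student hidden units, teacher hidden units
LR, STEPS = 0.05, 20_000    # SGD learning rate and steps per task

def teacher_pair(similarity):
    """Two teachers whose first-layer weights correlate by `similarity` in [0, 1]."""
    w1 = rng.standard_normal((K, D))
    noise = rng.standard_normal((K, D))
    w2 = similarity * w1 + np.sqrt(1.0 - similarity**2) * noise
    return w1, w2

def teacher_out(w, x):
    # ReLU committee teacher: sum of hidden activations
    return np.sum(np.maximum(w @ x, 0.0))

def student_out(W, v, x):
    h = np.maximum(W @ x, 0.0)
    return v @ h, h

def train(W, v, w_teacher, steps=STEPS):
    """Online SGD on fresh Gaussian inputs (the online teacher-student setting)."""
    for _ in range(steps):
        x = rng.standard_normal(D) / np.sqrt(D)
        y_hat, h = student_out(W, v, x)
        err = y_hat - teacher_out(w_teacher, x)
        mask = (h > 0).astype(float)
        W -= LR * err * np.outer(v * mask, x)   # backprop through the ReLU
        v -= LR * err * h
    return W, v

def test_error(W, v, w_teacher, n=2000):
    xs = rng.standard_normal((n, D)) / np.sqrt(D)
    errs = [(student_out(W, v, x)[0] - teacher_out(w_teacher, x))**2 for x in xs]
    return np.mean(errs)

for sim in [0.0, 0.5, 1.0]:
    w1, w2 = teacher_pair(sim)
    W = rng.standard_normal((M, D)) * 0.01
    v = rng.standard_normal(M) * 0.01
    W, v = train(W, v, w1)
    err_before = test_error(W, v, w1)
    W, v = train(W, v, w2)                      # switch to the second task
    forgetting = test_error(W, v, w1) - err_before
    print(f"similarity={sim:.1f}  forgetting on task 1: {forgetting:.4f}")
```

Sweeping the similarity over a finer grid and averaging over seeds is one way to probe numerically for the non-monotonic forgetting curve the abstract describes, with the worst forgetting expected at intermediate similarity.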
Author Information
Sebastian Lee (Imperial College / UCL)
ML PhD Student
Stefano Sarao Mannelli (Gatsby & SWC | UCL)
Claudia Clopath (Imperial College London)
Sebastian Goldt (International School of Advanced Studies (SISSA))
I'm an assistant professor working on theories of learning in neural networks.
Andrew Saxe (UCL)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Maslow's Hammer in Catastrophic Forgetting: Node Re-Use vs. Node Activation
  Wed, Jul 20 through Thu, Jul 21, Hall E #1434
More from the Same Authors
- 2023: Local learning in recurrent networks modelling motor cortex
  Claudia Clopath
- 2023 Poster: Neural networks trained with SGD learn distributions of increasing complexity
  Maria Refinetti · Alessandro Ingrosso · Sebastian Goldt
- 2022 Poster: The dynamics of representation learning in shallow, non-linear autoencoders
  Maria Refinetti · Sebastian Goldt
- 2022 Poster: The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
  Andrew Saxe · Shagun Sodhani · Sam Lewallen
- 2022 Spotlight: The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
  Andrew Saxe · Shagun Sodhani · Sam Lewallen
- 2022 Spotlight: The dynamics of representation learning in shallow, non-linear autoencoders
  Maria Refinetti · Sebastian Goldt
- 2021 Poster: Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed
  Maria Refinetti · Sebastian Goldt · Florent Krzakala · Lenka Zdeborova
- 2021 Poster: Align, then memorise: the dynamics of learning with feedback alignment
  Maria Refinetti · Stéphane d'Ascoli · Ruben Ohana · Sebastian Goldt
- 2021 Poster: Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective
  Florin Gogianu · Tudor Berariu · Mihaela Rosca · Claudia Clopath · Lucian Busoniu · Razvan Pascanu
- 2021 Spotlight: Align, then memorise: the dynamics of learning with feedback alignment
  Maria Refinetti · Stéphane d'Ascoli · Ruben Ohana · Sebastian Goldt
- 2021 Spotlight: Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed
  Maria Refinetti · Sebastian Goldt · Florent Krzakala · Lenka Zdeborova
- 2021 Spotlight: Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective
  Florin Gogianu · Tudor Berariu · Mihaela Rosca · Claudia Clopath · Lucian Busoniu · Razvan Pascanu
- 2021 Poster: Continual Learning in the Teacher-Student Setup: Impact of Task Similarity
  Sebastian Lee · Sebastian Goldt · Andrew Saxe
- 2021 Spotlight: Continual Learning in the Teacher-Student Setup: Impact of Task Similarity
  Sebastian Lee · Sebastian Goldt · Andrew Saxe
- 2020: Invited Talk: Claudia Clopath, "Continual learning through consolidation – a neuroscience angle"
  Claudia Clopath
- 2019: Andrew Saxe: Intriguing phenomena in training and generalization dynamics of deep networks
  Andrew Saxe
- 2019: Poster discussion
  Roman Novak · Maxime Gabella · Frederic Dreyer · Siavash Golkar · Anh Tong · Irina Higgins · Mirco Milletari · Joe Antognini · Sebastian Goldt · Adín Ramírez Rivera · Roberto Bondesan · Ryo Karakida · Remi Tachet des Combes · Michael Mahoney · Nicholas Walker · Stanislav Fort · Samuel Smith · Rohan Ghosh · Aristide Baratin · Diego Granziol · Stephen Roberts · Dmitry Vetrov · Andrew Wilson · César Laurent · Valentin Thomas · Simon Lacoste-Julien · Dar Gilboa · Daniel Soudry · Anupam Gupta · Anirudh Goyal · Yoshua Bengio · Erich Elsen · Soham De · Stanislaw Jastrzebski · Charles H Martin · Samira Shabanian · Aaron Courville · Shotaro Akaho · Lenka Zdeborova · Ethan Dyer · Maurice Weiler · Pim de Haan · Taco Cohen · Max Welling · Ping Luo · Zhanglin Peng · Nasim Rahaman · Loic Matthey · Danilo J. Rezende · Jaesik Choi · Kyle Cranmer · Lechao Xiao · Jaehoon Lee · Yasaman Bahri · Jeffrey Pennington · Greg Yang · Jiri Hron · Jascha Sohl-Dickstein · Guy Gur-Ari
- 2019: Analyzing the dynamics of online learning in over-parameterized two-layer neural networks
  Sebastian Goldt
- 2019 Poster: Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models
  Stefano Sarao Mannelli · Florent Krzakala · Pierfrancesco Urbani · Lenka Zdeborova
- 2019 Poster: Policy Consolidation for Continual Reinforcement Learning
  Christos Kaplanis · Murray Shanahan · Claudia Clopath
- 2019 Oral: Policy Consolidation for Continual Reinforcement Learning
  Christos Kaplanis · Murray Shanahan · Claudia Clopath
- 2019 Oral: Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models
  Stefano Sarao Mannelli · Florent Krzakala · Pierfrancesco Urbani · Lenka Zdeborova
- 2018 Poster: Continual Reinforcement Learning with Complex Synapses
  Christos Kaplanis · Murray Shanahan · Claudia Clopath
- 2018 Oral: Continual Reinforcement Learning with Complex Synapses
  Christos Kaplanis · Murray Shanahan · Claudia Clopath
- 2017 Poster: Hierarchy Through Composition with Multitask LMDPs
  Andrew Saxe · Adam Earle · Benjamin Rosman
- 2017 Talk: Hierarchy Through Composition with Multitask LMDPs
  Andrew Saxe · Adam Earle · Benjamin Rosman