Does Continual Learning Equally Forget All Parameters?
Haiyan Zhao · Tianyi Zhou · Guodong Long · Jing Jiang · Chengqi Zhang

Fri Jul 22 08:00 AM -- 08:15 AM (PDT)

Continual learning (CL) on neural networks suffers from catastrophic forgetting due to distribution or task shift. In this paper, we study which parts of neural nets are more prone to forgetting by investigating their training dynamics during CL. We discover that only a few modules (e.g., batch-norm, the last layer, earlier convolutional layers) are more task-specific and change sensitively across tasks, while the others can be shared across tasks as common knowledge. Hence, we attribute forgetting mainly to the former and find that finetuning them on only a small buffer at the end of any CL method brings non-trivial improvement. Because these modules have few parameters, such "Forgetting Prioritized Finetuning (FPF)" is efficient and requires only a small buffer to retain the previous tasks. We further develop an even simpler, replay-free method that applies FPF k times during CL to replace the costly every-step replay. Surprisingly, this "k-FPF" performs comparably to FPF and outperforms state-of-the-art CL methods while significantly reducing their computational overhead and cost. In experiments on several benchmarks of class- and domain-incremental CL, FPF consistently improves existing CL methods by a large margin, and k-FPF further improves efficiency without degrading accuracy.
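
Below is a minimal sketch of the FPF idea described in the abstract, assuming a PyTorch model already trained by some CL method and a small replay buffer exposed as a DataLoader. The choice of "forgetting-prone" modules here (batch-norm layers and the last linear layer) and the helper name `fpf_finetune` are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn as nn

def fpf_finetune(model, buffer_loader, steps=100, lr=1e-3, device="cpu"):
    """Finetune only forgetting-prone modules on a small replay buffer."""
    # Treat batch-norm layers and the final linear (classifier) layer as
    # task-specific; freeze everything else as shared knowledge.
    last_linear = [m for m in model.modules() if isinstance(m, nn.Linear)][-1]
    prone_params = []
    for module in model.modules():
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d)) or module is last_linear:
            prone_params += list(module.parameters())

    for p in model.parameters():
        p.requires_grad_(False)
    for p in prone_params:
        p.requires_grad_(True)

    opt = torch.optim.SGD(prone_params, lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.to(device).train()

    it = iter(buffer_loader)
    for _ in range(steps):
        try:
            x, y = next(it)
        except StopIteration:  # cycle through the small buffer repeatedly
            it = iter(buffer_loader)
            x, y = next(it)
        x, y = x.to(device), y.to(device)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return model
```

In this reading, FPF runs the routine once at the end of CL training, while k-FPF would invoke it k times at intervals during training instead of replaying buffer samples at every step.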

Author Information

Haiyan Zhao (University of Technology Sydney)
Tianyi Zhou (University of Washington)

Tianyi Zhou is a tenure-track assistant professor of Computer Science and UMIACS at the University of Maryland, College Park. He received his Ph.D. from the University of Washington, Seattle. His research interests are machine learning, optimization, and natural language processing. His recent work focuses on curriculum learning, hybrid human-artificial intelligence, trustworthy and robust AI, the plasticity-stability trade-off in ML, large language and multi-modality models, reinforcement learning, federated learning, and meta-learning. He has published ~90 papers at NeurIPS, ICML, ICLR, AISTATS, ACL, EMNLP, NAACL, COLING, CVPR, KDD, ICDM, AAAI, IJCAI, ISIT, Machine Learning (Springer), IEEE TIP/TNNLS/TKDE, etc. He is the recipient of the Best Student Paper Award at ICDM 2013 and the 2020 IEEE TCSC Most Influential Paper Award. He has served as an SPC member or area chair for AAAI, IJCAI, KDD, WACV, etc. Tianyi was a visiting research scientist at Google and a research intern at Microsoft Research Redmond and Yahoo! Labs.

Guodong Long (University of Technology Sydney)
Jing Jiang (University of Technology Sydney)
Chengqi Zhang (University of Technology Sydney)