

Spotlight in Workshop: Dynamic Neural Networks

Does Continual Learning Equally Forget All Parameters?

Haiyan Zhao · Tianyi Zhou · Guodong Long · Jing Jiang · Chengqi Zhang


Abstract:

Continual learning (CL) on neural networks suffers from catastrophic forgetting due to distribution or task shift. In this paper, we study which parts of neural nets are more prone to forgetting by investigating their training dynamics during CL. We discover that only a few modules (e.g., batch-norm, the last layer, earlier convolutional layers) are more task-specific and change sensitively between tasks, while others can be shared across tasks as common knowledge. Hence, we attribute forgetting mainly to the former and find that finetuning them on only a small buffer at the end of any CL method brings non-trivial improvement. Because these modules have few parameters, such "Forgetting Prioritized Finetuning (FPF)" is efficient and requires only a small buffer to retain the previous tasks. We further develop an even simpler, replay-free method that applies FPF k times during CL to replace the costly every-step replay. Surprisingly, this "k-FPF" performs comparably to FPF and outperforms state-of-the-art CL methods while significantly reducing their computational overhead and cost. In experiments on several benchmarks of class- and domain-incremental CL, FPF consistently improves existing CL methods by a large margin, and k-FPF further improves efficiency without degrading accuracy.
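The sketch below illustrates the core idea of FPF as described in the abstract: after (or during) continual learning, freeze the shared backbone and finetune only the forgetting-prone modules (here, batch-norm layers and the linear classifier) on a small replay buffer. This is a minimal PyTorch-style sketch, not the authors' implementation; the function name `fpf_finetune`, the `buffer_loader` argument, the choice of which module types count as "sensitive", and the hyperparameters are all illustrative assumptions.

```python
# Minimal FPF sketch (assumptions noted in the lead-in above).
import torch
import torch.nn as nn
import torch.nn.functional as F


def fpf_finetune(model: nn.Module, buffer_loader, steps: int = 100, lr: float = 1e-3):
    """Finetune only the task-sensitive modules on a small replay buffer.

    `buffer_loader` is assumed to yield (x, y) mini-batches sampled from the
    replay buffer; module selection (batch-norm + linear layers) follows the
    abstract's examples and is illustrative, not the paper's exact criterion.
    """
    # Freeze all parameters, then unfreeze the forgetting-prone modules.
    for p in model.parameters():
        p.requires_grad_(False)
    sensitive_params = []
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.Linear)):
            for p in m.parameters():
                p.requires_grad_(True)
                sensitive_params.append(p)

    opt = torch.optim.SGD(sensitive_params, lr=lr)
    model.train()
    it = iter(buffer_loader)
    for _ in range(steps):
        try:
            x, y = next(it)
        except StopIteration:
            it = iter(buffer_loader)
            x, y = next(it)
        opt.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        opt.step()
    return model
```

Per the abstract, k-FPF would simply invoke such a buffer-based finetuning routine k times at intervals during training, in place of every-step replay.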
