MoCL: Metabolic Optimization for Curvature-Aware Continual Learning
Abstract
Continual learning requires models to mitigate catastrophic forgetting of prior knowledge while learning a sequence of tasks. Although existing methods based on orthogonal projection prevent interference by constraining parameter updates, they progressively limit plasticity as the task sequence grows. Their reliance on linear approximations further causes the projected gradients to deviate from the underlying nonlinear manifold. To address these issues, we propose Metabolic Optimization for Continual Learning (MoCL), a rehearsal-free framework that balances stability and plasticity. To capture the geometric manifold of prior knowledge, MoCL introduces a factorized subspace approximation that avoids expensive explicit matrix inversion. Because the Fisher Information Matrix exhibits a heavy-tailed distribution, we employ a metabolic gating mechanism based on Tsallis entropy to suppress updates that conflict with historical knowledge. Theoretical and empirical analyses show that MoCL enables the model to converge to a shared low-loss region across sequential tasks. Extensive experiments on multiple benchmarks demonstrate that MoCL outperforms state-of-the-art methods in both classification accuracy and efficiency.
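For reference, the metabolic gating presumably builds on the standard Tsallis entropy; as a sketch, for a distribution $p = (p_1, \dots, p_n)$ with entropic index $q > 0$, $q \neq 1$,
\[
S_q(p) = \frac{1}{q-1}\Bigl(1 - \sum_{i=1}^{n} p_i^{\,q}\Bigr),
\qquad
\lim_{q \to 1} S_q(p) = -\sum_{i=1}^{n} p_i \log p_i ,
\]
which recovers the Shannon entropy as $q \to 1$. How MoCL maps Fisher information statistics to $p$ and selects $q$ is not fixed by the abstract and is left unspecified here.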