Beyond Buffer Limits: Energy-Based Data Reassembly for Continual Learning
Abstract
Continual learning (CL) aims to acquire new knowledge from a non-stationary data stream while retaining performance on previously learned tasks. Memory-based replay methods mitigate catastrophic forgetting by storing and revisiting past samples, but their effectiveness is fundamentally constrained by limited memory capacity, as each stored example represents only a single data instance. In this work, we propose data reassembly for CL, a new paradigm that significantly increases memory efficiency by assembling composite replay samples from patches of existing training data. Instead of storing raw training examples, we partition the current task's training data into elementary patches and dynamically reassemble them into coherent replay instances through an energy-based optimization framework. The proposed objective jointly enforces semantic compatibility with target labels and global consistency among the assembled patches. To make this optimization tractable, we derive an efficient variational inference algorithm that constructs a compact yet diverse set of reassembled samples for replay. Extensive theoretical analysis and experiments across multiple CL benchmarks demonstrate that data reassembly consistently outperforms existing memory-based approaches, achieving stronger retention of past knowledge while maintaining competitive computational efficiency.
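
To make the reassembly idea concrete, the following is a minimal, hypothetical sketch in PyTorch of how patches could be extracted from current-task data and greedily recombined under an energy that balances semantic compatibility (a frozen classifier's agreement with the target label) and patch consistency (feature dispersion among the chosen patches). The function names, the exact energy form, and the greedy coordinate-descent search are illustrative assumptions, not the paper's actual objective or its variational inference procedure.

```python
# Hypothetical sketch of energy-based patch reassembly for replay construction.
# The energy terms and the greedy search below are assumptions made for illustration;
# the paper's actual objective and variational inference algorithm may differ.
import torch
import torch.nn.functional as F


def partition_into_patches(images, patch_size):
    """Split a batch of images (B, C, H, W) into non-overlapping square patches."""
    B, C, H, W = images.shape
    patches = images.unfold(2, patch_size, patch_size).unfold(3, patch_size, patch_size)
    # (B, C, H/ps, W/ps, ps, ps) -> (B * num_patches, C, ps, ps)
    return patches.permute(0, 2, 3, 1, 4, 5).reshape(-1, C, patch_size, patch_size)


def assemble(patch_pool, indices, grid):
    """Tile the selected patches into a single composite image of shape (C, grid*ps, grid*ps)."""
    rows = [torch.cat([patch_pool[int(indices[r * grid + c])] for c in range(grid)], dim=-1)
            for r in range(grid)]
    return torch.cat(rows, dim=-2)


def energy(classifier, composite, target_label, patch_features, indices):
    """Energy = semantic term (classifier disagreement with the target label)
    + consistency term (feature dispersion among the selected patches)."""
    logits = classifier(composite.unsqueeze(0))
    semantic = F.cross_entropy(logits, torch.tensor([target_label]))
    chosen = patch_features[indices]            # (num_slots, feature_dim)
    consistency = chosen.var(dim=0).mean()      # low variance -> globally coherent patches
    return semantic + consistency


@torch.no_grad()
def greedy_reassembly(classifier, patch_pool, patch_features, target_label,
                      grid=4, steps=50, candidates=16):
    """Greedy coordinate-descent stand-in for the variational inference step:
    repeatedly re-pick the patch in one slot that lowers the energy."""
    num_slots = grid * grid
    indices = torch.randint(len(patch_pool), (num_slots,))
    for _ in range(steps):
        slot = int(torch.randint(num_slots, (1,)))
        best_e, best_j = float("inf"), int(indices[slot])
        for j in torch.randint(len(patch_pool), (candidates,)).tolist():
            indices[slot] = j
            comp = assemble(patch_pool, indices, grid)
            e = energy(classifier, comp, target_label, patch_features, indices).item()
            if e < best_e:
                best_e, best_j = e, j
        indices[slot] = best_j
    return assemble(patch_pool, indices, grid)
```

In this sketch, `classifier` is assumed to be a model already trained on the current task, and `patch_features` are per-patch embeddings (e.g., pooled features from the same model); one reassembled composite per target label would then be stored in the replay buffer in place of many raw examples.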