Focus, Align, and Sustain: Counteracting Gradient Dilution in Incremental Object Detection
Abstract
Adapting Detection Transformers to Incremental Object Detection (IOD) poses a systemic challenge, as set-based optimization is inherently destabilized by sequential learning. In this work, we identify Gradient Dilution as the root cause of performance degradation, wherein the optimization signals required to preserve old knowledge are progressively weakened. This phenomenon manifests as a cascading erosion driven by three tightly coupled factors: \textit{Signal Dispersion}, where foreground gradients are overwhelmed by background noise; \textit{Assignment Drift}, where stochastic query–target matching induces inconsistent gradient trajectories; and \textit{Support Attrition}, where gradients from retained samples provide insufficient coverage of the old-class feature space, weakening decision boundaries under interference from new classes. To counteract this, we propose FAS, a unified framework that \underline{F}ocuses, \underline{A}ligns, and \underline{S}ustains gradient flow throughout incremental learning. Specifically, we introduce prior-injected queries to focus discriminative signals by filtering background interference at the source. We further propose deterministic anchor distillation to align query–target assignments, bypassing unstable bipartite matching and enforcing semantic consistency across stages. Finally, we devise manifold-support replay to sustain the distributional support of old classes, counteracting representational erosion induced by continual updates. Extensive experiments show that FAS restores robust optimization dynamics and outperforms state-of-the-art methods, achieving an improvement of over 5.0 AP in the challenging 40+10$\times$4 incremental setting.
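To make the alignment idea concrete, the sketch below illustrates one way deterministic anchor distillation could be realized: because the old (teacher) and new (student) models decode the same fixed anchor queries, query $i$ in the student corresponds to query $i$ in the teacher, so distillation proceeds index-wise and no bipartite (Hungarian) matching is needed. This is a minimal illustration under our own assumptions, not the paper's implementation; the function name, the confidence threshold \texttt{keep\_thresh}, and the BCE/L1 loss choices are hypothetical.

\begin{verbatim}
import torch
import torch.nn.functional as F

def anchor_distillation_loss(student_logits, teacher_logits,
                             student_boxes, teacher_boxes,
                             keep_thresh=0.3):
    """Distill old-model predictions at fixed anchor-query indices.

    Both models decode the same deterministic anchors, so query i in the
    student aligns with query i in the teacher by construction.
    Shapes: logits (N, Q, C); boxes (N, Q, 4).
    """
    with torch.no_grad():
        teacher_probs = teacher_logits.sigmoid()
        # Keep only anchors where the teacher confidently detects an
        # old class (hypothetical filtering rule).
        keep = teacher_probs.max(dim=-1).values > keep_thresh  # (N, Q)

    if keep.sum() == 0:
        return student_logits.new_zeros(())

    # Index-wise distillation: no query-target matching step.
    cls_loss = F.binary_cross_entropy_with_logits(
        student_logits[keep], teacher_probs[keep])
    box_loss = F.l1_loss(student_boxes[keep], teacher_boxes[keep])
    return cls_loss + box_loss
\end{verbatim}

Since the query-to-anchor assignment is fixed across stages, the distillation targets for a given query do not fluctuate between training steps, which is what suppresses the Assignment Drift described above.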