CONGA: Confidence-and-Gradient-Aware Learning Rate Schedule for Test-Time Adaptation
Abstract
Test-time adaptation (TTA) adapts pretrained models to test data on the fly. Current TTA methods have focused on what to adapt: lightweight domain-aware components (prompts, normalization statistics) updated with consistency-aware self-supervised losses. This work investigates the more fundamental yet underexplored optimization process, providing insights and guidelines on how to appropriately update models during TTA. By analyzing the optimization error during TTA, we identify a pivotal stability-plasticity trade-off: the model should adapt to novel distributions while retaining learned knowledge. This motivates our design of a CONfidence-and-Gradient-Aware scheduler (CONGA) that constrains the model's learning rate (LR) within an adaptive exploration interval. At each iteration, the lower bound encourages exploration on informative, confident samples, while the upper bound prevents aggressive overfitting to noisy optimization gradients. Based on our theoretical findings, an adaptation-progress-conditioned cosine decay function determines the specific LR within the interval. As an LR scheduler, CONGA serves as a plug-in module for existing TTA methods, introducing little computational overhead. Extensive experiments and analyses demonstrate the superiority and validity of CONGA.
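The mechanism described above can be illustrated with a minimal sketch. Note this is an assumption-laden illustration, not the paper's actual implementation: the bound values `lr_lo` and `lr_hi` are passed in as fixed arguments here, whereas in CONGA they would be derived per iteration from sample confidence and gradient statistics; the function name `conga_lr` and the linear notion of adaptation progress are likewise hypothetical.

```python
import math

def conga_lr(step, total_steps, lr_lo, lr_hi):
    """Sketch of an interval-constrained cosine LR schedule.

    lr_lo: lower bound of the exploration interval (in CONGA, confidence-aware).
    lr_hi: upper bound of the interval (in CONGA, gradient-aware).
    The LR decays from lr_hi to lr_lo as adaptation progresses, and is
    clamped so it never leaves [lr_lo, lr_hi].
    """
    progress = step / max(total_steps, 1)  # adaptation progress in [0, 1]
    lr = lr_lo + 0.5 * (lr_hi - lr_lo) * (1.0 + math.cos(math.pi * progress))
    return min(max(lr, lr_lo), lr_hi)  # keep LR inside the interval

# Early in adaptation the LR sits at the upper bound (plasticity);
# late in adaptation it approaches the lower bound (stability).
```

In a full TTA loop, this value would be written into the optimizer's parameter groups before each update step, leaving the underlying TTA method unchanged, which is what makes an LR scheduler a natural plug-in.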