Poster

TiME: Test-Time Mixture-of-Experts Routing via Asymmetric CO-Optimal Transport for Continual Test-Time Adaptation

Tianlun Liu ⋅ Zhiliang Tian ⋅ Zhen Huang ⋅ Tianle Liu ⋅ Xingzhi Zhou ⋅ Feng Liu ⋅ Dongsheng Li

Abstract

Large language models usually face continuous domain shifts at test time, which degrade performance on unseen, shifting domains. Continual test-time adaptation (CTTA) addresses this by adapting to evolving test domains while preserving knowledge of previous domains, that is, by balancing adaptability and stability (the A-S balance). Existing CTTA methods are constrained by dense base models, which encode knowledge from all domains into a single global model and therefore struggle to achieve the A-S balance. We observe that the sparsity of mixture-of-experts (MoE) models makes them better suited to the A-S balance than dense models. In CTTA, however, MoE has difficulty (1) correctly routing samples from unseen, shifting domains and (2) capturing domain-level shifts. In this paper, we propose test-time mixture-of-experts routing (TiME) via asymmetric co-optimal transport (As-COOT): we model MoE routing in CTTA as a test-time allocation problem solved with co-optimal transport (COOT). To ensure reliable routing, we propose a semantic space alignment that aligns the sample and expert distributions via bidirectional contrastive learning. To address COOT's limitations in CTTA, we propose As-COOT, which relaxes the sample-side constraints while enforcing the expert-side constraints, ensuring robustness to noise and a balanced expert load. Experiments show that TiME outperforms baselines. Code is available at anonymous.4open.science/r/As-COOT-78FF
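
The asymmetry described above (a free sample side and an enforced expert side) can be illustrated with a much simpler problem than As-COOT itself. The sketch below is not the authors' method: it assumes plain entropic optimal transport over a given sample-to-expert cost matrix, drops the sample-side marginal entirely, and keeps only the expert-side marginal, which gives a closed-form plan. The function name `semi_relaxed_routing`, the parameter `eps`, and the random cost matrix are hypothetical and chosen only for illustration.

```python
import numpy as np

def semi_relaxed_routing(cost, expert_load, eps=0.1):
    """Entropic transport plan with only the expert-side marginal enforced.

    cost:        (n_samples, n_experts) routing cost (assumed given).
    expert_load: (n_experts,) target mass per expert, summing to 1.
    eps:         entropic regularisation strength (illustrative default).
    """
    logits = -cost / eps
    # Subtract the per-column maximum for numerical stability.
    logits = logits - logits.max(axis=0, keepdims=True)
    kernel = np.exp(logits)
    # Normalise each expert column, then scale it to its target load.
    # Rows (samples) are left unconstrained, so a noisy sample may
    # receive very little mass.
    return kernel / kernel.sum(axis=0, keepdims=True) * expert_load

# Toy usage: 6 samples, 3 experts, uniform target load (all hypothetical).
rng = np.random.default_rng(0)
cost = rng.random((6, 3))
expert_load = np.full(3, 1.0 / 3.0)
plan = semi_relaxed_routing(cost, expert_load)
assigned_expert = plan.argmax(axis=1)  # hard routing decision per sample
```

In this toy setting each column of `plan` sums to the corresponding expert's target load, so the load is balanced by construction, while the row sums are free to vary. The full method additionally involves co-optimal transport and the semantic alignment step, which this sketch does not attempt to reproduce.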