Return of Frustratingly Easy Unsupervised Video Domain Adaptation
Abstract
Unsupervised video domain adaptation (UVDA) is a practical yet under-explored problem. In this paper, we propose a frustratingly easy UVDA method, called \emph{MetaTrans}. Specifically, \emph{MetaTrans} adopts a concise learning objective that contains only two fundamental loss terms. Despite this simplicity, \emph{MetaTrans} embodies an advanced UVDA principle, namely, handling the spatial and temporal divergence of cross-domain videos separately, through a careful model architecture design. By implementing a temporal-static subtraction module, \emph{MetaTrans} effectively removes spatial and temporal divergence. Extensive empirical evaluations, particularly on various cross-domain action recognition tasks, show that \emph{MetaTrans} achieves substantial absolute adaptation performance and significantly larger relative gains than state-of-the-art UVDA baselines.
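As a rough illustration of the temporal-static subtraction idea, one plausible reading is to split per-frame features into a static component (the temporal mean over the clip) and a dynamic residual, so that spatial and temporal divergence can be treated separately. The sketch below is a hypothetical, simplified interpretation; the function name and the mean-subtraction design are assumptions, not the paper's actual module.

```python
def temporal_static_subtraction(frames):
    """Split a clip's per-frame feature vectors into a static part
    (temporal mean) and a dynamic residual per frame.

    Hypothetical sketch: the real module in the paper may differ.
    frames: list of T feature vectors, each a list of D floats.
    Returns (static, dynamics), where static is a D-vector and
    dynamics is a T x D list of residuals summing to zero over time.
    """
    T = len(frames)
    D = len(frames[0])
    # Static component: average each feature dimension over time.
    static = [sum(f[d] for f in frames) / T for d in range(D)]
    # Dynamic component: what remains after removing the static part.
    dynamics = [[f[d] - static[d] for d in range(D)] for f in frames]
    return static, dynamics
```

Under this reading, the static vector would carry the spatial (appearance) content aligned across domains, while the zero-mean residuals carry the temporal dynamics, letting each divergence be handled by its own loss term.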