Stable Velocity: A Variance Perspective on Flow Matching
Donglin Yang ⋅ Yongxing Zhang ⋅ Xin Yu ⋅ Liang Hou ⋅ Xin Tao ⋅ Pengfei Wan ⋅ Xiaojuan Qi ⋅ Renjie Liao
Abstract
While flow matching is conceptually elegant, its reliance on single-sample conditional velocities leads to high-variance training targets that destabilize optimization and slow convergence. By explicitly characterizing this variance, we identify 1) a *high-variance regime* near the prior, where optimization is challenging, and 2) a *low-variance regime* near the data distribution, where conditional and marginal velocities nearly coincide. Leveraging this insight, we propose **Stable Velocity**, a unified framework that improves both training and sampling. For training, we introduce Stable Velocity Matching (StableVM), an unbiased variance-reduction objective, along with Variance-Aware Representation Alignment (VA-REPA), which adaptively strengthens auxiliary supervision in the *low-variance regime*. For inference, we show that dynamics in the *low-variance regime* admit closed-form simplifications, enabling Stable Velocity Sampling (StableVS), a finetuning-free acceleration. Extensive experiments on ImageNet $256\times256$ and large pretrained text-to-image and text-to-video models, including SD3.5, Flux, Qwen-Image, and Wan2.2, demonstrate consistent improvements in training efficiency and more than $2\times$ faster sampling within the *low-variance regime* without degrading sample quality.
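For context, the variance argument can be made concrete with the standard conditional flow matching objective; the notation below (linear interpolant, prior sample $x_0$, data sample $x_1$) is an assumed convention for illustration and is not necessarily the paper's:

$$
\mathcal{L}_{\mathrm{CFM}}(\theta)=\mathbb{E}_{t,\,x_0,\,x_1}\bigl\|v_\theta(x_t,t)-(x_1-x_0)\bigr\|^2,\qquad x_t=(1-t)\,x_0+t\,x_1 .
$$

The minimizer at each $x_t$ is the marginal velocity $u_t(x_t)=\mathbb{E}[\,x_1-x_0\mid x_t\,]$, and the excess loss is the conditional variance $\operatorname{Var}(x_1-x_0\mid x_t)$; the *high-variance* and *low-variance regimes* above refer to where this term is large (near the prior) or nearly vanishes (near the data).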