Contrastive Flow Map Matching
Abstract
Flow map matching (FMM) enables one- and few-step sampling for diffusion-style generation, yet its performance is often limited by a mismatch between the ground-truth transitions used in training and the model-induced flow maps used in sampling. We propose \textbf{Contrastive Flow Map Matching (CFMM)}, a principled framework that explicitly aligns FMM training with practical sampling. Our approach is grounded in a theoretical upper bound on the reverse KL divergence, which decomposes the distributional gap into two terms: a marginal mismatch over intermediate states and a conditional mismatch in endpoint reconstruction. This analysis motivates two complementary objectives: average-velocity regression for marginal alignment and a sampling-aligned InfoNCE contrastive loss for conditional refinement. CFMM changes only the training procedure: it can be applied as a plug-in to pre-trained FMMs or used to train FMMs from scratch, and it incurs no inference-time overhead. Experiments on CIFAR-10, ImageNet, and LSUN across multiple FMM baselines show consistent improvements in fidelity and perceptual quality at only modest additional training cost.