Geometry-Misalignment in Distributional Learning
Abstract
Distributional learning problems minimize discrepancies between probability measures, such as optimal transport costs or the Sinkhorn divergence, yet they are typically solved with Euclidean first-order methods in parameter space. We show that this mismatch is structural rather than algorithmic. We introduce geometry-misalignment, a local condition number that measures the distortion between Euclidean geometry and the intrinsic geometry induced by a distributional objective. For a broad class of problems, we establish lower bounds showing that Euclidean first-order methods incur an unavoidable convergence slowdown proportional to the misalignment, even under intrinsic strong convexity and smoothness. We further prove that geometry-aware preconditioned methods attain matching upper bounds independent of the misalignment, yielding a sharp separation between Euclidean and geometry-aware optimization. Beyond convergence rates, we show that under finite computation budgets geometry-misalignment induces an optimization-dependent excess-risk term, directly linking optimization geometry to statistical efficiency. We develop a geometry-calibrated optimization framework that estimates the misalignment and activates geometry-aware updates only when necessary. Experiments on distribution matching for domain adaptation validate the theory, with improvements concentrated in high-misalignment regimes and achieved at negligible overhead.
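As a minimal formal sketch of the central quantity (illustrative assumptions, not the paper's definitions): suppose the distributional objective $F$ induces a positive-definite metric $G(\theta)$ on the parameter space. The local geometry-misalignment at $\theta$ can then be read as the condition number of $G$,
\[
  \kappa_{\mathrm{mis}}(\theta)
  \;=\;
  \frac{\lambda_{\max}\bigl(G(\theta)\bigr)}{\lambda_{\min}\bigl(G(\theta)\bigr)},
\]
and a geometry-aware preconditioned update takes the form
\[
  \theta_{t+1} \;=\; \theta_t \;-\; \eta\, G(\theta_t)^{-1} \nabla F(\theta_t),
\]
whose contraction rate, unlike that of plain Euclidean gradient descent, does not degrade with $\kappa_{\mathrm{mis}}$. The symbols $F$, $G$, $\kappa_{\mathrm{mis}}$, and $\eta$ are illustrative notation introduced here, not taken from the abstract.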