Outlier-Robust Optimal Transport with Applications to Generative Modeling and Data Privacy
Sloan Nietert · Rachel Cummings · Ziv Goldfeld
The Wasserstein distance, rooted in optimal transport (OT) theory, is a popular discrepancy measure between probability distributions with various applications to statistics and machine learning. Despite their rich structure and demonstrated utility, Wasserstein distances are sensitive to outliers in the considered distributions, which hinders their applicability in practice. This motivates us to propose a new outlier-robust Wasserstein distance, denoted $W^{\delta}_p$, that identifies probability measures within total variation distance $\delta$ of each other. We conduct a thorough theoretical study of $W^{\delta}_p$, encompassing characterization of optimal perturbations, regularity, duality, and statistical results. In particular, we derive a remarkably simple dual form for $W^{\delta}_p$ that lends itself to implementation via an elementary modification to standard, duality-based OT solvers. We also reveal connections between robust OT and Pufferfish privacy (PP), a generalization of differential privacy, demonstrating that $W^{\delta}_{\infty}$ naturally generalizes the relation between $W_{\infty}$ and $(\epsilon,0)$-PP to the $(\epsilon,\delta)$-privacy regime.

Author Information

Sloan Nietert (Cornell University)
Rachel Cummings (Columbia University)
Ziv Goldfeld (Cornell University)
