Poster
Optimal Transport Barycenter via Nonconvex-Concave Minimax Optimization
Kaheon Kim · Rentian Yao · Changbo Zhu · Xiaohui Chen
West Exhibition Hall B2-B3 #W-611
Abstract:
The optimal transport barycenter (a.k.a. Wasserstein barycenter) is a fundamental notion of averaging that extends from the Euclidean space to the Wasserstein space of probability distributions. Computation of the *unregularized* barycenter for discretized probability distributions on point clouds is a challenging task when the domain dimension $d > 1$. Most practical algorithms for approximating the barycenter problem are based on entropic regularization. In this paper, we introduce a nearly linear time $O(m \log{m})$ and linear space complexity $O(m)$ primal-dual algorithm, the *Wasserstein-Descent $\dot{\mathbb{H}}^1$-Ascent* (WDHA) algorithm, for computing the *exact* barycenter when the input probability density functions are discretized on an $m$-point grid. The key success of the WDHA algorithm hinges on alternating between two different yet closely related Wasserstein and Sobolev optimization geometries for the primal barycenter and dual Kantorovich potential subproblems. Under reasonable assumptions, we establish the convergence rate and iteration complexity of WDHA to its stationary point when the step size is appropriately chosen. Superior computational efficacy, scalability, and accuracy over the existing Sinkhorn-type algorithms are demonstrated on high-resolution (e.g., $1024 \times 1024$ images) 2D synthetic and real data.
Lay Summary:
The Wasserstein barycenter represents the "average" of a given set of probability distributions. Computing this average exactly becomes challenging when the dimension of the domain is greater than one. Most existing methods simplify the problem by adding a smoothing term (called entropic regularization), but this introduces some inaccuracy. In this work, we present a new algorithm called *Wasserstein-Descent $\dot{\mathbb{H}}^1$-Ascent* (WDHA) that can compute the exact average efficiently, using nearly linear time and memory when the densities of the distributions are known on a common grid. We demonstrate that WDHA outperforms popular existing methods in both speed and accuracy on high-resolution (e.g., $1024 \times 1024$ images) 2D synthetic and real data.
Live content is unavailable. Log in and register to view live content