Poster
in
Workshop: 2nd Annual Workshop on Topology, Algebra, and Geometry in Machine Learning (TAG-ML)

DeepEMD: A Transformer-based Fast Estimation of the Earth Mover's Distance

Atul Kumar Sinha · François Fleuret


Abstract: The Earth Mover's Distance (EMD) is the measure of choice for assessing similarity between point clouds. However, the computational cost of the standard algorithms for computing it makes it prohibitive as a training loss, and the usual practice is to use a surrogate such as the Chamfer distance. We propose instead a deep model, dubbed DeepEMD, that directly estimates the EMD. We cast the prediction of the bipartite matching as that of an attention matrix, from which we obtain an accurate estimate of both the EMD and its gradient. Experiments demonstrate the accuracy of this model, in particular when test and train data come from different origins; moreover, the model remains accurate when processing point clouds several times larger than those seen during training. Computation-wise, while the exact Hungarian algorithm has complexity $O(N^3)$, DeepEMD scales as $O(N^2)$, where $N$ is the total number of points. This leads to a $100\times$ wall-clock speed-up with $1024$ points. DeepEMD also achieves better performance than the standard Sinkhorn algorithm, with about a $40\times$ speed-up. The availability of gradients allows DeepEMD to be used to train a VAE, leading to a model with lower reconstruction EMD than a standard baseline trained with the Chamfer distance.
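To make the comparison concrete, the two quantities discussed above can be computed directly for small clouds. Below is a minimal reference sketch (not the authors' code) of the exact EMD via the Hungarian algorithm, which scales as $O(N^3)$, and of the Chamfer surrogate, which only needs the $O(N^2)$ pairwise-distance matrix; the function names and the equal-size-cloud assumption are illustrative choices.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def emd(a: np.ndarray, b: np.ndarray) -> float:
    """Exact EMD between two equal-size point clouds via the Hungarian
    algorithm (O(N^3)): mean distance under the optimal bipartite matching."""
    cost = cdist(a, b)  # (N, N) pairwise Euclidean distances
    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].mean())


def chamfer(a: np.ndarray, b: np.ndarray) -> float:
    """Chamfer distance: each point matched to its nearest neighbour in the
    other cloud. O(N^2), but not a true one-to-one matching."""
    cost = cdist(a, b)
    return float(cost.min(axis=1).mean() + cost.min(axis=0).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((128, 3))
    b = rng.standard_normal((128, 3))
    print(f"EMD:     {emd(a, b):.4f}")
    print(f"Chamfer: {chamfer(a, b):.4f}")
```

Since each nearest-neighbour distance is at most the corresponding matched distance, the Chamfer distance lower-bounds twice the EMD; the gap between the two is precisely what a learned estimator such as DeepEMD aims to close at $O(N^2)$ cost.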