DeepEMD: A Transformer-based Fast Estimation of the Earth Mover's Distance
Atul Kumar Sinha · François Fleuret
Fri Jul 28 07:00 PM -- 08:00 PM (PDT)
Event URL: https://openreview.net/forum?id=MST20kIvhA
The Earth Mover's Distance (EMD) is the measure of choice to assess similarity between point clouds. However, the computational cost of standard algorithms to compute it makes it prohibitive as a training loss, and the standard approach is to use a surrogate such as the Chamfer distance. We propose instead a deep model, dubbed DeepEMD, that directly estimates the EMD. We cast the prediction of the bipartite matching as that of an attention matrix, from which we obtain accurate estimates of both the EMD and its gradient. Experiments demonstrate the accuracy of this model, even when test and train data come from different origins. Moreover, in our experiments, the model remains accurate when processing point clouds several times larger than those seen during training. Computation-wise, while the complexity of the exact Hungarian algorithm is $O(N^3)$, DeepEMD scales as $O(N^2)$, where $N$ is the total number of points. This yields a $100\times$ wall-clock speed-up with $1024$ points. DeepEMD also achieves better performance than the standard Sinkhorn algorithm, with about a $40\times$ speed-up. The availability of gradients allows DeepEMD to be used to train a VAE, leading to a model with lower reconstruction EMD than a standard baseline trained with the Chamfer distance.
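For context, here is a minimal sketch of the two reference computations the abstract compares against: the exact EMD via the $O(N^3)$ Hungarian algorithm, and the differentiable $O(N^2)$ Chamfer surrogate. It assumes equal-size clouds and uses SciPy's `linear_sum_assignment`; function and variable names are illustrative, not from the paper's code.

```python
import torch
from scipy.optimize import linear_sum_assignment

def exact_emd(x, y):
    """Exact EMD between equal-size point clouds x, y of shape (N, 3),
    via the O(N^3) Hungarian algorithm on the pairwise cost matrix."""
    cost = torch.cdist(x, y)                          # (N, N) pairwise Euclidean distances
    rows, cols = linear_sum_assignment(cost.numpy())  # optimal bipartite matching
    return cost[rows, cols].mean()                    # average matched distance

def chamfer(x, y):
    """Chamfer distance: each point's distance to its nearest neighbor in
    the other cloud. Differentiable and O(N^2), but only an EMD surrogate."""
    d = torch.cdist(x, y)
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()
```

The Hungarian matching is not differentiable through the assignment itself, which is why the Chamfer distance is the usual training loss and why a learned, differentiable EMD estimator is attractive.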
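And here is a hedged sketch of the core idea as the abstract describes it: read an attention matrix between the two clouds as a soft bipartite matching, and estimate the EMD as the expected transport cost under that matching. The embeddings `qx`, `ky` (e.g. from a transformer encoder), the temperature `tau`, and all names are assumptions for illustration, not the paper's actual architecture.

```python
import torch

def soft_matching_emd(qx, ky, x, y, tau=0.1):
    """Estimate EMD from an attention matrix treated as a soft matching.
    qx, ky: (N, d) learned per-point embeddings; x, y: (N, 3) coordinates."""
    logits = qx @ ky.transpose(-1, -2) / tau  # (N, N) attention scores, O(N^2)
    attn = logits.softmax(dim=-1)             # row-stochastic soft matching
    cost = torch.cdist(x, y)                  # transport costs between points
    return (attn * cost).sum(dim=-1).mean()   # expected matched cost, differentiable
```

Because the estimate is an ordinary differentiable expression, its gradient with respect to the point coordinates is available for free, which is what enables the VAE training experiment mentioned above.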
Author Information
Atul Kumar Sinha (University of Geneva)
François Fleuret (University of Geneva)
More from the Same Authors
- 2023: Towards Efficient World Models
  Eloi Alonso · Vincent Micheli · François Fleuret
- 2023: Fast Causal Attention with Dynamic Sparsity
  Daniele Paliotta · Matteo Pagliardini · Martin Jaggi · François Fleuret
- 2023 Poster: Pareto Manifold Learning: Tackling multiple tasks via ensembles of single-task models
  Nikolaos Dimitriadis · Pascal Frossard · François Fleuret
- 2020 Poster: Optimizer Benchmarking Needs to Account for Hyperparameter Tuning
  Prabhu Teja Sivaprasad · Florian Mai · Thijs Vogels · Martin Jaggi · François Fleuret
- 2020 Poster: Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
  Angelos Katharopoulos · Apoorv Vyas · Nikolaos Pappas · François Fleuret