Structured prediction requires searching over a combinatorial number of structures. To tackle this problem, we introduce SparseMAP, a new method for sparse structured inference, together with corresponding loss functions. SparseMAP inference is able to automatically select only a few global structures: it is situated between MAP inference, which picks a single structure, and marginal inference, which assigns probability mass to all structures, including implausible ones. Importantly, SparseMAP can be computed using only calls to a MAP oracle, hence it is applicable even to problems where marginal inference is intractable, such as linear assignment. Moreover, thanks to the solution sparsity, gradient backpropagation is efficient regardless of the structure. SparseMAP thus enables us to augment deep neural networks with generic and sparse structured hidden layers. Experiments in dependency parsing and natural language inference reveal competitive accuracy, improved interpretability, and the ability to capture natural language ambiguities, which is attractive for pipeline systems.
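Concretely (from the paper rather than the abstract above), SparseMAP can be posed as the Euclidean projection of the score vector theta onto the marginal polytope M, i.e. minimizing 0.5 * ||mu - theta||^2 over mu in M. The "MAP oracle only" property holds because conditional-gradient-style solvers for this quadratic program only ever need a linear maximization over M, which is exactly a MAP call. Below is a minimal NumPy sketch under those assumptions; the names sparsemap_fw and map_oracle are illustrative, and plain Frank-Wolfe stands in for the paper's actual, more efficient active set solver.

```python
import numpy as np

def sparsemap_fw(theta, map_oracle, max_iter=100, tol=1e-6):
    """Sketch: SparseMAP as projection onto the marginal polytope,
    min_{mu in M} 0.5 * ||mu - theta||^2, via vanilla Frank-Wolfe.

    `map_oracle(scores)` must return the indicator (vertex) vector of
    the highest-scoring structure as a NumPy array; both names here
    are illustrative assumptions, not the paper's API.
    """
    mu = map_oracle(theta).astype(float)   # start at the MAP vertex
    weights = {tuple(mu): 1.0}             # convex weights over selected structures

    for _ in range(max_iter):
        grad = mu - theta                  # gradient of the quadratic objective
        vertex = map_oracle(-grad)         # linear subproblem = one MAP call
        direction = vertex - mu
        gap = -grad @ direction            # Frank-Wolfe duality gap
        if gap <= tol:
            break
        # Exact line search for a quadratic objective, clipped to [0, 1].
        step = min(1.0, gap / (direction @ direction))
        mu = mu + step * direction
        # Bookkeeping: the distribution over structures stays sparse.
        weights = {v: (1.0 - step) * w for v, w in weights.items()}
        key = tuple(vertex)
        weights[key] = weights.get(key, 0.0) + step
        # Drop numerically dead structures.
        weights = {v: w for v, w in weights.items() if w > 1e-12}

    return mu, weights
```

As a sanity check of the sketch: when the structures are the k one-hot vectors, M is the probability simplex, map_oracle reduces to argmax, and the procedure recovers a sparsemax-style projection; for the linear assignment case mentioned in the abstract, the oracle could be an assignment solver such as scipy.optimize.linear_sum_assignment.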
Author Information
Vlad Niculae (Cornell University)
Andre Filipe Torres Martins (Instituto de Telecomunicacoes)
Mathieu Blondel (NTT)
Claire Cardie (Cornell University)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Poster: SparseMAP: Differentiable Sparse Structured Inference
  Wed. Jul 11th, 04:15 -- 07:00 PM, Room Hall B #66
More from the Same Authors
- 2022 Poster: Modeling Structure with Undirected Neural Networks
  Tsvetomila Mihaylova · Vlad Niculae · Andre Filipe Torres Martins
- 2022 Spotlight: Modeling Structure with Undirected Neural Networks
  Tsvetomila Mihaylova · Vlad Niculae · Andre Filipe Torres Martins
- 2020 Poster: LP-SparseMAP: Differentiable Relaxed Optimization for Sparse Structured Prediction
  Vlad Niculae · Andre Filipe Torres Martins
- 2018 Poster: Differentiable Dynamic Programming for Structured Prediction and Attention
  Arthur Mensch · Mathieu Blondel
- 2018 Oral: Differentiable Dynamic Programming for Structured Prediction and Attention
  Arthur Mensch · Mathieu Blondel
- 2017 Poster: Soft-DTW: a Differentiable Loss Function for Time-Series
  Marco Cuturi · Mathieu Blondel
- 2017 Talk: Soft-DTW: a Differentiable Loss Function for Time-Series
  Marco Cuturi · Mathieu Blondel