Setting regularization parameters for Lasso-type estimators is notoriously difficult, though crucial for obtaining the best accuracy. The most popular hyperparameter optimization approach is grid-search on a held-out dataset. However, grid-search requires choosing a predefined grid of parameters and scales exponentially with the number of parameters. Another class of approaches casts hyperparameter optimization as a bi-level optimization problem, typically solved by gradient descent. The key challenge for these approaches is estimating the gradient w.r.t. the hyperparameters. Computing this gradient via forward or backward automatic differentiation usually suffers from high memory consumption, while implicit differentiation typically involves solving a linear system, which can be prohibitive and numerically unstable. In addition, implicit differentiation usually assumes smooth loss functions, which is not the case for Lasso-type problems. This work introduces an efficient implicit differentiation algorithm, without matrix inversion, tailored for Lasso-type problems. Our proposal scales to high-dimensional data by leveraging the sparsity of the solutions. Empirically, we demonstrate that the proposed method outperforms a large number of standard methods for hyperparameter optimization.
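To make the gradient-based approach concrete, here is a minimal Python sketch (assuming NumPy and scikit-learn; function and variable names are illustrative, and this is not the paper's exact algorithm, which avoids even the small matrix solve below by updating the Jacobian alongside coordinate descent). It shows why sparsity helps: implicit differentiation of the Lasso optimality conditions yields a hypergradient of a held-out criterion that only requires a linear system restricted to the support of the solution.

```python
# Hedged sketch: hypergradient of a validation MSE w.r.t. the Lasso
# regularization parameter lam, via implicit differentiation on the support.
import numpy as np
from sklearn.linear_model import Lasso


def lasso_hypergradient(X_train, y_train, X_val, y_val, lam):
    n_train = X_train.shape[0]
    # Inner problem: min_beta 1/(2 n) ||y - X beta||^2 + lam ||beta||_1
    beta = Lasso(alpha=lam, fit_intercept=False).fit(X_train, y_train).coef_
    support = np.flatnonzero(beta)  # only |S| coordinates are active
    if support.size == 0:
        return 0.0
    # Differentiating the optimality conditions on the support S
    # (assuming the support and signs are locally constant):
    #   (1/n) X_S^T X_S J_S + sign(beta_S) = 0   ->   small |S| x |S| system.
    X_S = X_train[:, support]
    J_S = np.linalg.solve(X_S.T @ X_S / n_train, -np.sign(beta[support]))
    # Outer criterion: validation MSE; chain rule through the sparse Jacobian.
    grad_val = X_val[:, support].T @ (X_val[:, support] @ beta[support] - y_val)
    grad_val /= X_val.shape[0]
    return grad_val @ J_S  # d C(beta_hat(lam)) / d lam
```

Such a hypergradient can then be fed to any first-order method to optimize the regularization parameter on the validation loss, instead of sweeping a predefined grid.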
Author Information
Quentin Bertrand (Inria)
Quentin Klopfenstein (Université de Bourgogne Franche-Comté)
Mathieu Blondel (Google)
Samuel Vaiter (CNRS)
Alexandre Gramfort (Inria)
Joseph Salmon (Université de Montpellier)
More from the Same Authors
- 2023 Poster: FaDIn: Fast Discretized Inference for Hawkes Processes with General Parametric Kernels
  Guillaume Staerman · Cédric Allain · Alexandre Gramfort · Thomas Moreau
- 2023 Poster: Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective
  Michael Sander · Joan Puigcerver · Josip Djolonga · Gabriel Peyré · Mathieu Blondel
- 2022 Poster: Stochastic smoothing of the top-K calibrated hinge loss for deep imbalanced classification
  Camille Garcin · Maximilien Servajean · Alexis Joly · Joseph Salmon
- 2022 Poster: Differentially Private Coordinate Descent for Composite Empirical Risk Minimization
  Paul Mangold · Aurélien Bellet · Joseph Salmon · Marc Tommasi
- 2022 Spotlight: Differentially Private Coordinate Descent for Composite Empirical Risk Minimization
  Paul Mangold · Aurélien Bellet · Joseph Salmon · Marc Tommasi
- 2022 Spotlight: Stochastic smoothing of the top-K calibrated hinge loss for deep imbalanced classification
  Camille Garcin · Maximilien Servajean · Alexis Joly · Joseph Salmon
- 2021 Poster: Disentangling syntax and semantics in the brain with deep networks
  Charlotte Caucheteux · Alexandre Gramfort · Jean-Remi King
- 2021 Spotlight: Disentangling syntax and semantics in the brain with deep networks
  Charlotte Caucheteux · Alexandre Gramfort · Jean-Remi King
- 2021 Poster: Momentum Residual Neural Networks
  Michael Sander · Pierre Ablin · Mathieu Blondel · Gabriel Peyré
- 2021 Spotlight: Momentum Residual Neural Networks
  Michael Sander · Pierre Ablin · Mathieu Blondel · Gabriel Peyré
- 2020 Poster: Debiased Sinkhorn barycenters
  Hicham Janati · Marco Cuturi · Alexandre Gramfort
- 2020 Poster: Fast Differentiable Sorting and Ranking
  Mathieu Blondel · Olivier Teboul · Quentin Berthet · Josip Djolonga
- 2019 Poster: Geometric Losses for Distributional Learning
  Arthur Mensch · Mathieu Blondel · Gabriel Peyré
- 2019 Oral: Geometric Losses for Distributional Learning
  Arthur Mensch · Mathieu Blondel · Gabriel Peyré
- 2019 Poster: Optimal Mini-Batch and Step Sizes for SAGA
  Nidham Gazagnadou · Robert Gower · Joseph Salmon
- 2019 Poster: Screening rules for Lasso with non-convex Sparse Regularizers
  Alain Rakotomamonjy · Gilles Gasso · Joseph Salmon
- 2019 Oral: Optimal Mini-Batch and Step Sizes for SAGA
  Nidham Gazagnadou · Robert Gower · Joseph Salmon
- 2019 Oral: Screening rules for Lasso with non-convex Sparse Regularizers
  Alain Rakotomamonjy · Gilles Gasso · Joseph Salmon
- 2019 Poster: Safe Grid Search with Optimal Complexity
  Eugene Ndiaye · Tam Le · Olivier Fercoq · Joseph Salmon · Ichiro Takeuchi
- 2019 Oral: Safe Grid Search with Optimal Complexity
  Eugene Ndiaye · Tam Le · Olivier Fercoq · Joseph Salmon · Ichiro Takeuchi
- 2018 Poster: Celer: a Fast Solver for the Lasso with Dual Extrapolation
  Mathurin Massias · Joseph Salmon · Alexandre Gramfort
- 2018 Oral: Celer: a Fast Solver for the Lasso with Dual Extrapolation
  Mathurin Massias · Joseph Salmon · Alexandre Gramfort
- 2018 Poster: Differentiable Dynamic Programming for Structured Prediction and Attention
  Arthur Mensch · Mathieu Blondel
- 2018 Oral: Differentiable Dynamic Programming for Structured Prediction and Attention
  Arthur Mensch · Mathieu Blondel
- 2018 Poster: SparseMAP: Differentiable Sparse Structured Inference
  Vlad Niculae · Andre Filipe Torres Martins · Mathieu Blondel · Claire Cardie
- 2018 Oral: SparseMAP: Differentiable Sparse Structured Inference
  Vlad Niculae · Andre Filipe Torres Martins · Mathieu Blondel · Claire Cardie
- 2017 Poster: Soft-DTW: a Differentiable Loss Function for Time-Series
  Marco Cuturi · Mathieu Blondel
- 2017 Talk: Soft-DTW: a Differentiable Loss Function for Time-Series
  Marco Cuturi · Mathieu Blondel