We present a framework based on bilevel optimization for learning multilayer, deep data representations. On the one hand, the lower-level problem finds a representation by successively minimizing layer-wise objectives composed of a prescribed regularizer, a fidelity term, and a linear function, the latter two depending on the representation found at the previous layer. On the other hand, the upper-level problem optimizes over the linear functions so as to yield a linearly separable final representation. We show that, by choosing the fidelity term as the quadratic distance between two successive layer-wise representations, the bilevel problem reduces to the training of a feed-forward neural network. By contrast, building on Bregman distances, we devise a novel neural network architecture that additionally involves the inverse of the activation function, reminiscent of the skip connection used in ResNets. Numerical experiments suggest that the proposed Bregman variant benefits from better learning properties and more robust prediction performance.
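To make the contrast concrete, the sketch below compares a standard feed-forward layer with a layer in which the inverse activation of the previous representation enters additively, as the abstract describes. This is an illustrative assumption, not the paper's exact formulation: the layer form `sigmoid(sigmoid_inv(x) + W @ x + b)` and the choice of sigmoid (whose inverse is the logit) are placeholders consistent with the description above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_inv(y):
    # logit: inverse of the sigmoid, defined for y in (0, 1)
    return np.log(y) - np.log(1.0 - y)

def feedforward_layer(x, W, b):
    # standard layer, arising from the quadratic fidelity term
    return sigmoid(W @ x + b)

def bregman_layer(x, W, b):
    # hypothetical Bregman-style layer: the inverse activation of the
    # previous representation is added inside the activation, acting
    # like a skip connection (with W = 0, b = 0 the layer is the identity)
    return sigmoid(sigmoid_inv(x) + W @ x + b)

x = np.array([0.2, 0.7])            # representation in (0, 1)
W = np.zeros((2, 2))
b = np.zeros(2)
print(bregman_layer(x, W, b))       # recovers x exactly
print(feedforward_layer(x, W, b))   # collapses to sigmoid(0) = 0.5
```

With zero weights the Bregman-style layer passes the representation through unchanged, whereas the plain feed-forward layer does not — one way to read the claimed connection to residual architectures.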
Author Information
Jordan Frecon (INSA Rouen)
Gilles Gasso (INSA Rouen)
Massimiliano Pontil (Istituto Italiano di Tecnologia & University College London)
Saverio Salzo (Istituto Italiano di Tecnologia)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Bregman Neural Networks »
  Thu Jul 21 through Fri Jul 22, Room Hall E #438
More from the Same Authors
- 2022 Poster: Batch Greenkhorn Algorithm for Entropic-Regularized Multimarginal Optimal Transport: Linear Rate of Convergence and Iteration Complexity »
  Vladimir Kostic · Saverio Salzo · Massimiliano Pontil
- 2022 Spotlight: Batch Greenkhorn Algorithm for Entropic-Regularized Multimarginal Optimal Transport: Linear Rate of Convergence and Iteration Complexity »
  Vladimir Kostic · Saverio Salzo · Massimiliano Pontil
- 2021 Poster: Robust Unsupervised Learning via L-statistic Minimization »
  Andreas Maurer · Daniela Angela Parletta · Andrea Paudice · Massimiliano Pontil
- 2021 Spotlight: Robust Unsupervised Learning via L-statistic Minimization »
  Andreas Maurer · Daniela Angela Parletta · Andrea Paudice · Massimiliano Pontil
- 2020 Poster: On the Iteration Complexity of Hypergradient Computation »
  Riccardo Grazzi · Luca Franceschi · Massimiliano Pontil · Saverio Salzo
- 2019 Poster: Screening Rules for Lasso with Non-Convex Sparse Regularizers »
  Alain Rakotomamonjy · Gilles Gasso · Joseph Salmon
- 2019 Oral: Screening Rules for Lasso with Non-Convex Sparse Regularizers »
  Alain Rakotomamonjy · Gilles Gasso · Joseph Salmon
- 2018 Poster: Bilevel Programming for Hyperparameter Optimization and Meta-Learning »
  Luca Franceschi · Paolo Frasconi · Saverio Salzo · Riccardo Grazzi · Massimiliano Pontil
- 2018 Oral: Bilevel Programming for Hyperparameter Optimization and Meta-Learning »
  Luca Franceschi · Paolo Frasconi · Saverio Salzo · Riccardo Grazzi · Massimiliano Pontil