Marginal-likelihood-based model selection, though promising, is rarely used in deep learning because the marginal likelihood is difficult to estimate. Instead, most approaches rely on validation data, which may not be readily available. In this work, we present a scalable marginal-likelihood estimation method to select both hyperparameters and network architectures based on the training data alone. Some hyperparameters can be estimated online during training, which simplifies the procedure. Our marginal-likelihood estimate is based on Laplace’s method and Gauss-Newton approximations to the Hessian, and it outperforms cross-validation and manual tuning on standard regression and image classification datasets, especially in terms of calibration and out-of-distribution detection. Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable (e.g., in nonstationary settings).
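For orientation, the Laplace approximation to the log marginal likelihood mentioned in the abstract is commonly written as shown below. This is the textbook form under assumed notation, not an equation quoted from the paper: \theta_* denotes the MAP estimate, P the number of parameters, \mathbf{J}_n the network Jacobian at input n, \mathbf{\Lambda}_n the negative Hessian of the log likelihood with respect to the network output, and \mathbf{P}_0 the prior precision (assuming a Gaussian prior). The second line is the usual generalized Gauss-Newton substitute for the Hessian.

```latex
\begin{align}
\log p(\mathcal{D} \mid \mathcal{M})
  &\approx \log p(\mathcal{D} \mid \theta_*, \mathcal{M})
         + \log p(\theta_* \mid \mathcal{M})
         + \frac{P}{2} \log 2\pi
         - \frac{1}{2} \log \det \mathbf{H}_{\theta_*}, \\
\mathbf{H}_{\theta_*}
  &= -\nabla^2_{\theta} \log p(\mathcal{D}, \theta \mid \mathcal{M}) \Big|_{\theta = \theta_*}
   \approx \sum_{n=1}^{N} \mathbf{J}_n^\top \mathbf{\Lambda}_n \mathbf{J}_n + \mathbf{P}_0 .
\end{align}
```

Under this form, comparing models or hyperparameters amounts to comparing the right-hand side of the first line, which depends only on the training data.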
Author Information
Alexander Immer (ETH-Z, MPI-IS)
Matthias Bauer (DeepMind)
Vincent Fortuin (ETH Zürich)
I am doing my PhD in machine learning at ETH Zürich.
Gunnar Ratsch (ETH Zurich)
Khan Emtiyaz (RIKEN)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Spotlight: Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning
  Tue. Jul 20th 01:20 -- 01:25 PM
More from the Same Authors
- 2023: Delphic Offline Reinforcement Learning under Nonidentifiable Hidden Confounding
  Alizée Pace · Hugo Yèche · Bernhard Schölkopf · Gunnar Ratsch · Guy Tennenholtz
- 2023: Memory Maps to Understand Models
  Dharmesh Tailor · Paul Chang · Siddharth Swaroop · Eric Nalisnick · Arno Solin · Khan Emtiyaz
- 2023 Workshop: Duality Principles for Modern Machine Learning
  Thomas Moellenhoff · Zelda Mariet · Mathieu Blondel · Khan Emtiyaz
- 2023 Oral: Memory-Based Dual Gaussian Processes for Sequential Learning
  Paul Chang · Prakhar Verma · ST John · Arno Solin · Khan Emtiyaz
- 2023 Poster: Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels
  Alexander Immer · Tycho van der Ouderaa · Mark van der Wilk · Gunnar Ratsch · Bernhard Schölkopf
- 2023 Poster: Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning
  Wu Lin · Valentin Duruisseaux · Melvin Leok · Frank Nielsen · Khan Emtiyaz · Mark Schmidt
- 2023 Poster: Temporal Label Smoothing for Early Event Prediction
  Hugo Yèche · Alizée Pace · Gunnar Ratsch · Rita Kuznetsova
- 2023 Poster: Memory-Based Dual Gaussian Processes for Sequential Learning
  Paul Chang · Prakhar Verma · ST John · Arno Solin · Khan Emtiyaz
- 2021 Poster: PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees
  Jonas Rothfuss · Vincent Fortuin · Martin Josifoski · Andreas Krause
- 2021 Poster: Generalized Doubly Reparameterized Gradient Estimators
  Matthias Bauer · Andriy Mnih
- 2021 Spotlight: PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees
  Jonas Rothfuss · Vincent Fortuin · Martin Josifoski · Andreas Krause
- 2021 Spotlight: Generalized Doubly Reparameterized Gradient Estimators
  Matthias Bauer · Andriy Mnih
- 2021 Poster: Tractable structured natural-gradient descent using local parameterizations
  Wu Lin · Frank Nielsen · Khan Emtiyaz · Mark Schmidt
- 2021 Spotlight: Tractable structured natural-gradient descent using local parameterizations
  Wu Lin · Frank Nielsen · Khan Emtiyaz · Mark Schmidt
- 2020 Poster: Weakly-Supervised Disentanglement Without Compromises
  Francesco Locatello · Ben Poole · Gunnar Ratsch · Bernhard Schölkopf · Olivier Bachem · Michael Tschannen
- 2019 Poster: Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations
  Francesco Locatello · Stefan Bauer · Mario Lucic · Gunnar Ratsch · Sylvain Gelly · Bernhard Schölkopf · Olivier Bachem
- 2019 Oral: Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations
  Francesco Locatello · Stefan Bauer · Mario Lucic · Gunnar Ratsch · Sylvain Gelly · Bernhard Schölkopf · Olivier Bachem
- 2018 Poster: On Matching Pursuit and Coordinate Descent
  Francesco Locatello · Anant Raj · Sai Praneeth Reddy Karimireddy · Gunnar Ratsch · Bernhard Schölkopf · Sebastian Stich · Martin Jaggi
- 2018 Oral: On Matching Pursuit and Coordinate Descent
  Francesco Locatello · Anant Raj · Sai Praneeth Reddy Karimireddy · Gunnar Ratsch · Bernhard Schölkopf · Sebastian Stich · Martin Jaggi