Dropout regularization of deep neural networks has been a mysterious yet effective tool to prevent overfitting. Explanations for its success range from the prevention of "co-adapted" weights to it being a form of cheap Bayesian inference. We propose a novel framework for understanding multiplicative noise in neural networks, considering continuous distributions as well as Bernoulli noise (i.e. dropout). We show that multiplicative noise induces structured shrinkage priors on a network's weights. We derive the equivalence through reparametrization properties of scale mixtures and without invoking any approximations. Given the equivalence, we then show that dropout's Monte Carlo training objective approximates marginal MAP estimation. We extend this framework to ResNets, terming the prior "automatic depth determination" as it is the natural analog of "automatic relevance determination" for network depth. Lastly, we investigate two inference strategies that improve upon the aforementioned MAP approximation in regression benchmarks.
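The scale-mixture equivalence at the heart of the abstract can be checked numerically. Below is a minimal NumPy sketch (our own illustration, not the paper's code) for the Gaussian multiplicative-noise case: multiplying a weight w by noise ε ~ N(1, α) yields the same distribution as sampling the weight directly from N(w, α·w²), i.e. a prior whose scale is tied to the weight itself. The values of w and α are hypothetical.

```python
import numpy as np

# Illustrative sketch (not the paper's code): multiplicative Gaussian noise
# on a weight is a reparametrization of a per-weight scale-mixture prior.
# If eps ~ N(1, alpha), then w * eps ~ N(w, alpha * w^2), so the noise
# variance is tied to the weight's own scale.
# Hypothetical values: alpha = (1 - p) / p = 0.25 for keep rate p = 0.8.

rng = np.random.default_rng(0)
w, alpha, n = 2.0, 0.25, 1_000_000

# View 1: multiplicative noise applied to a fixed weight (Gaussian dropout).
w_noisy = w * rng.normal(loc=1.0, scale=np.sqrt(alpha), size=n)

# View 2: sample the weight directly from the induced shrinkage prior.
w_prior = rng.normal(loc=w, scale=np.sqrt(alpha) * abs(w), size=n)

# Both empirical distributions share mean w and variance alpha * w^2.
print(w_noisy.mean(), w_noisy.var())
print(w_prior.mean(), w_prior.var())
```

Both printed means are close to w = 2.0 and both variances close to α·w² = 1.0, consistent with the reparametrization argument; the Bernoulli (dropout) case works analogously with a discrete mixing distribution.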
Author Information
Eric Nalisnick (University of Cambridge & DeepMind)
Jose Miguel Hernandez-Lobato (University of Cambridge)
Padhraic Smyth (UC Irvine)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Poster: Dropout as a Structured Shrinkage Prior
  Fri. Jun 14th, 01:30 -- 04:00 AM, Room Pacific Ballroom #84
More from the Same Authors
- 2023: Leveraging Task Structures for Improved Identifiability in Neural Network Representations
  Wenlin Chen · Julien Horwood · Juyeon Heo · Jose Miguel Hernandez-Lobato
- 2023: Minimal Random Code Learning with Mean-KL Parameterization
  Jihao Andreas Lin · Gergely Flamich · Jose Miguel Hernandez-Lobato
- 2023 Poster: Deep Anomaly Detection under Labeling Budget Constraints
  Aodong Li · Chen Qiu · Marius Kloft · Padhraic Smyth · Stephan Mandt · Maja Rudolph
- 2022 Poster: Adapting the Linearised Laplace Model Evidence for Modern Deep Learning
  Javier Antorán · David Janz · James Allingham · Erik Daxberger · Riccardo Barbano · Eric Nalisnick · Jose Miguel Hernandez-Lobato
- 2022 Spotlight: Adapting the Linearised Laplace Model Evidence for Modern Deep Learning
  Javier Antorán · David Janz · James Allingham · Erik Daxberger · Riccardo Barbano · Eric Nalisnick · Jose Miguel Hernandez-Lobato
- 2022 Poster: Fair Generalized Linear Models with a Convex Penalty
  Hyungrok Do · Preston Putzel · Axel Martin · Padhraic Smyth · Judy Zhong
- 2022 Poster: Action-Sufficient State Representation Learning for Control with Structural Constraints
  Biwei Huang · Chaochao Lu · Liu Leqi · Jose Miguel Hernandez-Lobato · Clark Glymour · Bernhard Schölkopf · Kun Zhang
- 2022 Spotlight: Action-Sufficient State Representation Learning for Control with Structural Constraints
  Biwei Huang · Chaochao Lu · Liu Leqi · Jose Miguel Hernandez-Lobato · Clark Glymour · Bernhard Schölkopf · Kun Zhang
- 2022 Spotlight: Fair Generalized Linear Models with a Convex Penalty
  Hyungrok Do · Preston Putzel · Axel Martin · Padhraic Smyth · Judy Zhong
- 2022 Poster: Fast Relative Entropy Coding with A* coding
  Gergely Flamich · Stratis Markou · Jose Miguel Hernandez-Lobato
- 2022 Spotlight: Fast Relative Entropy Coding with A* coding
  Gergely Flamich · Stratis Markou · Jose Miguel Hernandez-Lobato
- 2021 Poster: Active Slices for Sliced Stein Discrepancy
  Wenbo Gong · Kaibo Zhang · Yingzhen Li · Jose Miguel Hernandez-Lobato
- 2021 Spotlight: Active Slices for Sliced Stein Discrepancy
  Wenbo Gong · Kaibo Zhang · Yingzhen Li · Jose Miguel Hernandez-Lobato
- 2021 Poster: A Gradient Based Strategy for Hamiltonian Monte Carlo Hyperparameter Optimization
  Andrew Campbell · Wenlong Chen · Vincent Stimper · Jose Miguel Hernandez-Lobato · Yichuan Zhang
- 2021 Spotlight: A Gradient Based Strategy for Hamiltonian Monte Carlo Hyperparameter Optimization
  Andrew Campbell · Wenlong Chen · Vincent Stimper · Jose Miguel Hernandez-Lobato · Yichuan Zhang
- 2021 Poster: Bayesian Deep Learning via Subnetwork Inference
  Erik Daxberger · Eric Nalisnick · James Allingham · Javier Antorán · Jose Miguel Hernandez-Lobato
- 2021 Spotlight: Bayesian Deep Learning via Subnetwork Inference
  Erik Daxberger · Eric Nalisnick · James Allingham · Javier Antorán · Jose Miguel Hernandez-Lobato
- 2020: "Latent Space Optimization with Deep Generative Models"
  Jose Miguel Hernandez-Lobato
- 2020: Invited Talk 2: Detecting Distribution Shift with Deep Generative Models
  Eric Nalisnick
- 2020: Invited Talk: Efficient Missing-value Acquisition with Variational Autoencoders
  Jose Miguel Hernandez-Lobato
- 2020 Poster: Reinforcement Learning for Molecular Design Guided by Quantum Mechanics
  Gregor Simm · Robert Pinsler · Jose Miguel Hernandez-Lobato
- 2020 Poster: A Generative Model for Molecular Distance Geometry
  Gregor Simm · Jose Miguel Hernandez-Lobato
- 2019 Oral: Hybrid Models with Deep and Invertible Features
  Eric Nalisnick · Akihiro Matsukawa · Yee-Whye Teh · Dilan Gorur · Balaji Lakshminarayanan
- 2019 Poster: EDDI: Efficient Dynamic Discovery of High-Value Information with Partial VAE
  Chao Ma · Sebastian Tschiatschek · Konstantina Palla · Jose Miguel Hernandez-Lobato · Sebastian Nowozin · Cheng Zhang
- 2019 Poster: Variational Implicit Processes
  Chao Ma · Yingzhen Li · Jose Miguel Hernandez-Lobato
- 2019 Poster: Hybrid Models with Deep and Invertible Features
  Eric Nalisnick · Akihiro Matsukawa · Yee-Whye Teh · Dilan Gorur · Balaji Lakshminarayanan
- 2019 Oral: Variational Implicit Processes
  Chao Ma · Yingzhen Li · Jose Miguel Hernandez-Lobato
- 2019 Oral: EDDI: Efficient Dynamic Discovery of High-Value Information with Partial VAE
  Chao Ma · Sebastian Tschiatschek · Konstantina Palla · Jose Miguel Hernandez-Lobato · Sebastian Nowozin · Cheng Zhang
- 2018 Poster: Decomposition of Uncertainty in Bayesian Deep Learning for Efficient and Risk-sensitive Learning
  Stefan Depeweg · Jose Miguel Hernandez-Lobato · Finale Doshi-Velez · Steffen Udluft
- 2018 Oral: Decomposition of Uncertainty in Bayesian Deep Learning for Efficient and Risk-sensitive Learning
  Stefan Depeweg · Jose Miguel Hernandez-Lobato · Finale Doshi-Velez · Steffen Udluft
- 2017 Poster: Parallel and Distributed Thompson Sampling for Large-scale Accelerated Exploration of Chemical Space
  Jose Miguel Hernandez-Lobato · James Requeima · Edward Pyzer-Knapp · Alan Aspuru-Guzik
- 2017 Poster: Grammar Variational Autoencoder
  Matt J. Kusner · Brooks Paige · Jose Miguel Hernandez-Lobato
- 2017 Talk: Grammar Variational Autoencoder
  Matt J. Kusner · Brooks Paige · Jose Miguel Hernandez-Lobato
- 2017 Talk: Parallel and Distributed Thompson Sampling for Large-scale Accelerated Exploration of Chemical Space
  Jose Miguel Hernandez-Lobato · James Requeima · Edward Pyzer-Knapp · Alan Aspuru-Guzik
- 2017 Poster: Sequence Tutor: Conservative fine-tuning of sequence generation models with KL-control
  Natasha Jaques · Shixiang Gu · Dzmitry Bahdanau · Jose Miguel Hernandez-Lobato · Richard E Turner · Douglas Eck
- 2017 Talk: Sequence Tutor: Conservative fine-tuning of sequence generation models with KL-control
  Natasha Jaques · Shixiang Gu · Dzmitry Bahdanau · Jose Miguel Hernandez-Lobato · Richard E Turner · Douglas Eck