Bayesian neural networks (BNNs) have shown promise for improving the robustness and uncertainty quantification of modern deep learning, but they generally struggle with underfitting at scale and with parameter efficiency. Deep ensembles, meanwhile, have emerged as an alternative for uncertainty quantification that, while outperforming BNNs on certain problems, also suffers from efficiency issues. It remains unclear how to combine the strengths of these two approaches and remedy their shared shortcomings. To tackle this challenge, we propose a rank-1 parameterization of BNNs, where each weight matrix involves only a distribution on a rank-1 subspace. We also revisit the use of mixture approximate posteriors to capture multiple modes; unlike typical mixtures, this approach admits a significantly smaller memory increase (e.g., only 0.4% for a ResNet-50 mixture of size 10). We perform a systematic empirical study of the choices of prior, variational posterior, and methods to improve training. For ResNet-50 on ImageNet, Wide ResNet 28-10 on CIFAR-10/100, and an RNN on MIMIC-III, rank-1 BNNs achieve state-of-the-art performance in log-likelihood, accuracy, and calibration on the test sets and out-of-distribution variants.
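As a rough illustration of the rank-1 idea, the minimal sketch below samples rank-1 factors r and s from Gaussian variational posteriors and applies them multiplicatively to a shared deterministic weight matrix, so the effective weights are W ∘ (r sᵀ). The function name `rank1_linear`, the NumPy setup, and the parameter values are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def rank1_linear(x, W, r_mu, r_sigma, s_mu, s_sigma):
    """One stochastic forward pass through a rank-1 BNN layer (illustrative)."""
    # Sample the rank-1 factors from their Gaussian variational posteriors.
    r = r_mu + r_sigma * rng.standard_normal(r_mu.shape)   # shape (d_in,)
    s = s_mu + s_sigma * rng.standard_normal(s_mu.shape)   # shape (d_out,)
    # The effective weights are W * outer(r, s); computing ((x * r) @ W) * s
    # gives the same output without materializing a perturbed weight matrix.
    return ((x * r) @ W) * s

d_in, d_out = 8, 3
x = rng.standard_normal((4, d_in))             # a batch of 4 inputs
W = 0.1 * rng.standard_normal((d_in, d_out))   # shared deterministic weights
out = rank1_linear(
    x, W,
    r_mu=np.ones(d_in),  r_sigma=0.1 * np.ones(d_in),
    s_mu=np.ones(d_out), s_sigma=0.1 * np.ones(d_out),
)
print(out.shape)  # (4, 3)
```

Note the parameter accounting: only the 1-D factors are stochastic, so each mixture component needs its own d_in + d_out factor parameters per layer while the full d_in × d_out matrix W is shared, consistent with the roughly 0.4% memory increase quoted above for a ResNet-50 mixture of size 10.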
Author Information
Mike Dusenberry (Google Brain, AI Residency)
Ghassen Jerfel (Duke University / Google Brain / UC Berkeley)
Yeming Wen (University of Toronto)
Yian Ma (Google)
Jasper Snoek (Google Brain)
Katherine Heller (Google)
Balaji Lakshminarayanan (Google Brain)
Dustin Tran (Google)
More from the Same Authors
- 2020 Workshop: Uncertainty and Robustness in Deep Learning Workshop (UDL)
  Sharon Yixuan Li · Balaji Lakshminarayanan · Dan Hendrycks · Thomas Dietterich · Jasper Snoek
- 2020 Poster: The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks
  Jakub Swiatkowski · Kevin Roth · Bastiaan Veeling · Linh Tran · Joshua V Dillon · Jasper Snoek · Stephan Mandt · Tim Salimans · Rodolphe Jenatton · Sebastian Nowozin
- 2020 Poster: On Thompson Sampling with Langevin Algorithms
  Eric Mazumdar · Aldo Pacchiano · Yian Ma · Michael Jordan · Peter Bartlett
- 2020 Poster: Black-Box Variational Inference as a Parametric Approximation to Langevin Dynamics
  Matthew Hoffman · Yian Ma
- 2020 Poster: How Good is the Bayes Posterior in Deep Neural Networks Really?
  Florian Wenzel · Kevin Roth · Bastiaan Veeling · Jakub Swiatkowski · Linh Tran · Stephan Mandt · Jasper Snoek · Tim Salimans · Rodolphe Jenatton · Sebastian Nowozin
- 2019 Workshop: Uncertainty and Robustness in Deep Learning
  Sharon Yixuan Li · Dan Hendrycks · Thomas Dietterich · Balaji Lakshminarayanan · Justin Gilmer
- 2019 Poster: Learning from Delayed Outcomes via Proxies with Applications to Recommender Systems
  Timothy Mann · Sven Gowal · András György · Huiyi Hu · Ray Jiang · Balaji Lakshminarayanan · Prav Srinivasan
- 2019 Oral: Learning from Delayed Outcomes via Proxies with Applications to Recommender Systems
  Timothy Mann · Sven Gowal · András György · Huiyi Hu · Ray Jiang · Balaji Lakshminarayanan · Prav Srinivasan
- 2019 Oral: Hybrid Models with Deep and Invertible Features
  Eric Nalisnick · Akihiro Matsukawa · Yee Whye Teh · Dilan Gorur · Balaji Lakshminarayanan
- 2019 Poster: Hybrid Models with Deep and Invertible Features
  Eric Nalisnick · Akihiro Matsukawa · Yee Whye Teh · Dilan Gorur · Balaji Lakshminarayanan
- 2018 Poster: Image Transformer
  Niki Parmar · Ashish Vaswani · Jakob Uszkoreit · Lukasz Kaiser · Noam Shazeer · Alexander Ku · Dustin Tran
- 2018 Oral: Image Transformer
  Niki Parmar · Ashish Vaswani · Jakob Uszkoreit · Lukasz Kaiser · Noam Shazeer · Alexander Ku · Dustin Tran
- 2017 Workshop: Implicit Generative Models
  Rajesh Ranganath · Ian Goodfellow · Dustin Tran · David Blei · Balaji Lakshminarayanan · Shakir Mohamed