Timezone: »
In many applications of deep learning, it is crucial to capture model and prediction uncertainty. Unlike classic neural networks (NN), Bayesian neural networks (BNN) allow us to reason about uncertainty in a more principled way. Stochastic Gradient Langevin Dynamics (SGLD) enables learning a BNN with only simple modifications to the standard optimization framework (SGD). Instead of obtaining a single point-estimate of the model, the result of SGLD is samples from the BNN posterior. However, SGLD and its extensions require storage of the entire history of model parameters, a potentially prohibitive cost (especially for large neural networks).We propose a framework, Adversarial Posterior Distillation, to distill the SGLD samples using Generative Adversarial Networks (GAN). At test-time, samples are generated by the GAN. We show that this distillation framework incurs no loss in performance on recent BNN applications including anomaly detection, active learning, and defense against attacks. By construction, our framework not only distills the Bayesian predictive distribution, but the posterior itself. This allows users to compute quantity such as the approximate model variance, which is useful in the downstream tasks.
Author Information
Kuan-Chieh Wang (Univeristy of Toronto)
Paul Vicol (University of Toronto)
James Lucas (University of Toronto)
Li Gu (University of Toronto)
Roger Grosse (University of Toronto and Vector Institute)
Richard Zemel (Vector Institute)
Related Events (a corresponding poster, oral, or spotlight)
-
2018 Poster: Distilling the Posterior in Bayesian Neural Networks »
Thu. Jul 12th 04:15 -- 07:00 PM Room Hall B #77
More from the Same Authors
-
2021 : Online Algorithmic Recourse by Collective Action »
Elliot Creager · Richard Zemel -
2022 : Towards Environment-Invariant Representation Learning for Robust Task Transfer »
Benjamin Eyre · Richard Zemel · Elliot Creager -
2023 : Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift »
Benjamin Eyre · Elliot Creager · David Madras · Vardan Papyan · Richard Zemel -
2023 : Statistics estimation in neural network training: a recursive identification approach »
Ruth Crasto · Xuchan Bao · Roger Grosse -
2023 : Calibrating Language Models via Augmented Prompt Ensembles »
Mingjian Jiang · Yangjun Ruan · Sicong Huang · Saifei Liao · Silviu Pitis · Roger Grosse · Jimmy Ba -
2023 Poster: Low-Variance Gradient Estimation in Unrolled Computation Graphs with ES-Single »
Paul Vicol -
2023 Test Of Time: Learning Fair Representations »
Richard Zemel · Yu Wu · Kevin Swersky · Toniann Pitassi · Cynthia Dwork -
2023 Poster: Efficient Parametric Approximations of Neural Network Function Space Distance »
Nikita Dhawan · Sicong Huang · Juhan Bae · Roger Grosse -
2022 : Invited talks 3, Q/A, Amy, Rich and Liting »
Liting Sun · Amy Zhang · Richard Zemel -
2022 : Invited talks 3, Amy Zhang, Rich Zemel and Liting Sun »
Amy Zhang · Richard Zemel · Liting Sun -
2022 Poster: On Implicit Bias in Overparameterized Bilevel Optimization »
Paul Vicol · Jonathan Lorraine · Fabian Pedregosa · David Duvenaud · Roger Grosse -
2022 Spotlight: On Implicit Bias in Overparameterized Bilevel Optimization »
Paul Vicol · Jonathan Lorraine · Fabian Pedregosa · David Duvenaud · Roger Grosse -
2021 : Implicit Regularization in Overparameterized Bilevel Optimization »
Paul Vicol -
2021 Poster: Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition »
Shengyang Sun · Jiaxin Shi · Andrew Wilson · Roger Grosse -
2021 Poster: SketchEmbedNet: Learning Novel Concepts by Imitating Drawings »
Alexander Wang · Mengye Ren · Richard Zemel -
2021 Poster: Learning a Universal Template for Few-shot Dataset Generalization »
Eleni Triantafillou · Hugo Larochelle · Richard Zemel · Vincent Dumoulin -
2021 Poster: Environment Inference for Invariant Learning »
Elliot Creager · Joern-Henrik Jacobsen · Richard Zemel -
2021 Spotlight: Learning a Universal Template for Few-shot Dataset Generalization »
Eleni Triantafillou · Hugo Larochelle · Richard Zemel · Vincent Dumoulin -
2021 Spotlight: Environment Inference for Invariant Learning »
Elliot Creager · Joern-Henrik Jacobsen · Richard Zemel -
2021 Spotlight: SketchEmbedNet: Learning Novel Concepts by Imitating Drawings »
Alexander Wang · Mengye Ren · Richard Zemel -
2021 Spotlight: Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition »
Shengyang Sun · Jiaxin Shi · Andrew Wilson · Roger Grosse -
2021 Poster: LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning »
Yuhuai Wu · Markus Rabe · Wenda Li · Jimmy Ba · Roger Grosse · Christian Szegedy -
2021 Poster: Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies »
Paul Vicol · Luke Metz · Jascha Sohl-Dickstein -
2021 Oral: Unbiased Gradient Estimation in Unrolled Computation Graphs with Persistent Evolution Strategies »
Paul Vicol · Luke Metz · Jascha Sohl-Dickstein -
2021 Spotlight: LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning »
Yuhuai Wu · Markus Rabe · Wenda Li · Jimmy Ba · Roger Grosse · Christian Szegedy -
2021 Poster: On Monotonic Linear Interpolation of Neural Network Parameters »
James Lucas · Juhan Bae · Michael Zhang · Stanislav Fort · Richard Zemel · Roger Grosse -
2021 Spotlight: On Monotonic Linear Interpolation of Neural Network Parameters »
James Lucas · Juhan Bae · Michael Zhang · Stanislav Fort · Richard Zemel · Roger Grosse -
2020 : Invited Talk 4: Prof. Richard Zemel from University of Toronto »
Richard Zemel -
2020 : Spotlight Talk 4: Few-shot Out-of-Distribution Detection »
Kuan-Chieh Wang -
2020 Workshop: Participatory Approaches to Machine Learning »
Angela Zhou · David Madras · Deborah Raji · Smitha Milli · Bogdan Kulynych · Richard Zemel -
2020 Poster: Causal Modeling for Fairness In Dynamical Systems »
Elliot Creager · David Madras · Toniann Pitassi · Richard Zemel -
2020 Poster: Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach »
Martin Mladenov · Elliot Creager · Omer Ben-Porat · Kevin Swersky · Richard Zemel · Craig Boutilier -
2020 Poster: Evaluating Lossy Compression Rates of Deep Generative Models »
Sicong Huang · Alireza Makhzani · Yanshuai Cao · Roger Grosse -
2020 Poster: Learning the Stein Discrepancy for Training and Evaluating Energy-Based Models without Sampling »
Will Grathwohl · Kuan-Chieh Wang · Joern-Henrik Jacobsen · David Duvenaud · Richard Zemel -
2019 Workshop: Learning and Reasoning with Graph-Structured Representations »
Ethan Fetaya · Zhiting Hu · Thomas Kipf · Yujia Li · Xiaodan Liang · Renjie Liao · Raquel Urtasun · Hao Wang · Max Welling · Eric Xing · Richard Zemel -
2019 Poster: Lorentzian Distance Learning for Hyperbolic Representations »
Marc Law · Renjie Liao · Jake Snell · Richard Zemel -
2019 Poster: Flexibly Fair Representation Learning by Disentanglement »
Elliot Creager · David Madras · Joern-Henrik Jacobsen · Marissa Weis · Kevin Swersky · Toniann Pitassi · Richard Zemel -
2019 Oral: Lorentzian Distance Learning for Hyperbolic Representations »
Marc Law · Renjie Liao · Jake Snell · Richard Zemel -
2019 Oral: Flexibly Fair Representation Learning by Disentanglement »
Elliot Creager · David Madras · Joern-Henrik Jacobsen · Marissa Weis · Kevin Swersky · Toniann Pitassi · Richard Zemel -
2019 Poster: Understanding the Origins of Bias in Word Embeddings »
Marc-Etienne Brunet · Colleen Alkalay-Houlihan · Ashton Anderson · Richard Zemel -
2019 Poster: Sorting Out Lipschitz Function Approximation »
Cem Anil · James Lucas · Roger Grosse -
2019 Poster: EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis »
Chaoqi Wang · Roger Grosse · Sanja Fidler · Guodong Zhang -
2019 Oral: Understanding the Origins of Bias in Word Embeddings »
Marc-Etienne Brunet · Colleen Alkalay-Houlihan · Ashton Anderson · Richard Zemel -
2019 Oral: EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis »
Chaoqi Wang · Roger Grosse · Sanja Fidler · Guodong Zhang -
2019 Oral: Sorting Out Lipschitz Function Approximation »
Cem Anil · James Lucas · Roger Grosse -
2018 Poster: Learning Adversarially Fair and Transferable Representations »
David Madras · Elliot Creager · Toniann Pitassi · Richard Zemel -
2018 Oral: Learning Adversarially Fair and Transferable Representations »
David Madras · Elliot Creager · Toniann Pitassi · Richard Zemel -
2018 Poster: Reviving and Improving Recurrent Back-Propagation »
Renjie Liao · Yuwen Xiong · Ethan Fetaya · Lisa Zhang · Kijung Yoon · Zachary S Pitkow · Raquel Urtasun · Richard Zemel -
2018 Poster: Noisy Natural Gradient as Variational Inference »
Guodong Zhang · Shengyang Sun · David Duvenaud · Roger Grosse -
2018 Oral: Noisy Natural Gradient as Variational Inference »
Guodong Zhang · Shengyang Sun · David Duvenaud · Roger Grosse -
2018 Oral: Reviving and Improving Recurrent Back-Propagation »
Renjie Liao · Yuwen Xiong · Ethan Fetaya · Lisa Zhang · Kijung Yoon · Zachary S Pitkow · Raquel Urtasun · Richard Zemel -
2018 Poster: Neural Relational Inference for Interacting Systems »
Thomas Kipf · Ethan Fetaya · Kuan-Chieh Wang · Max Welling · Richard Zemel -
2018 Poster: Differentiable Compositional Kernel Learning for Gaussian Processes »
Shengyang Sun · Guodong Zhang · Chaoqi Wang · Wenyuan Zeng · Jiaman Li · Roger Grosse -
2018 Oral: Neural Relational Inference for Interacting Systems »
Thomas Kipf · Ethan Fetaya · Kuan-Chieh Wang · Max Welling · Richard Zemel -
2018 Oral: Differentiable Compositional Kernel Learning for Gaussian Processes »
Shengyang Sun · Guodong Zhang · Chaoqi Wang · Wenyuan Zeng · Jiaman Li · Roger Grosse -
2017 Poster: Deep Spectral Clustering Learning »
Marc Law · Raquel Urtasun · Richard Zemel -
2017 Talk: Deep Spectral Clustering Learning »
Marc Law · Raquel Urtasun · Richard Zemel