Amortized variational inference (AVI) replaces instance-specific local inference with a global inference network. While AVI has enabled efficient training of deep generative models such as variational autoencoders (VAEs), recent empirical work suggests that inference networks can produce suboptimal variational parameters. We propose a hybrid approach: use AVI to initialize the variational parameters, then run stochastic variational inference (SVI) to refine them. Crucially, the local SVI procedure is itself differentiable, so the inference network and generative model can be trained end-to-end with gradient-based optimization. This semi-amortized approach enables the use of rich generative models without experiencing the posterior-collapse phenomenon common in training VAEs on problems like text generation. Experiments show this approach outperforms strong autoregressive and variational baselines on standard text and image datasets.
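The two-stage procedure in the abstract can be illustrated on a toy model. The sketch below is not the paper's method (which uses neural encoders/decoders and backpropagates through the SVI steps); it is a minimal, assumed example with a conjugate Gaussian model, p(z) = N(0, 1) and p(x|z) = N(z, 1), where the ELBO and its gradient are available in closed form. The linear "encoder" standing in for the amortized inference network is hypothetical.

```python
import numpy as np

def elbo(mu, x, sigma2=1.0):
    """ELBO for q(z) = N(mu, sigma2) under p(z) = N(0,1), p(x|z) = N(z,1)."""
    # E_q[log N(x | z, 1)] has a closed form for this toy model.
    recon = -0.5 * np.log(2 * np.pi) - 0.5 * ((x - mu) ** 2 + sigma2)
    # KL(N(mu, sigma2) || N(0, 1)) in closed form.
    kl = 0.5 * (mu ** 2 + sigma2 - 1.0 - np.log(sigma2))
    return recon - kl

def semi_amortized_mu(x, n_svi_steps=20, lr=0.2):
    # Stage 1 (AVI): amortized initialization. A crude linear map stands in
    # for the inference network here (hypothetical, for illustration only).
    mu = 0.1 * x
    # Stage 2 (SVI): gradient ascent on the ELBO w.r.t. the local parameter mu.
    for _ in range(n_svi_steps):
        grad = (x - mu) - mu  # d ELBO / d mu for this toy model
        mu = mu + lr * grad
    return mu

x = 2.0
mu_init = 0.1 * x              # amortized estimate alone
mu_refined = semi_amortized_mu(x)
# SVI refinement should strictly improve the ELBO over the amortized init.
```

In the paper the refined parameters feed the generative likelihood, and gradients flow back through the unrolled SVI updates into the encoder; here the closed-form gradient simply makes the refinement step concrete.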
Author Information
Yoon Kim (Harvard University)
Sam Wiseman (Harvard University)
Andrew Miller (Harvard University)
David Sontag (Massachusetts Institute of Technology)
Alexander Rush (Harvard University)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Oral: Semi-Amortized Variational Autoencoders
  Fri Jul 13th, 11:20 -- 11:40 AM, Room A7
More from the Same Authors
- 2019 Poster: Latent Normalizing Flows for Discrete Sequences
  Zachary Ziegler · Alexander Rush
- 2019 Oral: Latent Normalizing Flows for Discrete Sequences
  Zachary Ziegler · Alexander Rush
- 2019 Poster: Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models
  Michael Oberst · David Sontag
- 2019 Poster: Tensor Variable Elimination for Plated Factor Graphs
  Fritz Obermeyer · Elias Bingham · Martin Jankowiak · Neeraj Pradhan · Justin Chiu · Alexander Rush · Noah Goodman
- 2019 Oral: Tensor Variable Elimination for Plated Factor Graphs
  Fritz Obermeyer · Elias Bingham · Martin Jankowiak · Neeraj Pradhan · Justin Chiu · Alexander Rush · Noah Goodman
- 2019 Oral: Counterfactual Off-Policy Evaluation with Gumbel-Max Structural Causal Models
  Michael Oberst · David Sontag
- 2018 Poster: Weightless: Lossy weight encoding for deep neural network compression
  Brandon Reagen · Udit Gupta · Bob Adolf · Michael Mitzenmacher · Alexander Rush · Gu-Yeon Wei · David Brooks
- 2018 Poster: Adversarially Regularized Autoencoders
  Jake Zhao · Yoon Kim · Kelly Zhang · Alexander Rush · Yann LeCun
- 2018 Oral: Weightless: Lossy weight encoding for deep neural network compression
  Brandon Reagen · Udit Gupta · Bob Adolf · Michael Mitzenmacher · Alexander Rush · Gu-Yeon Wei · David Brooks
- 2018 Oral: Adversarially Regularized Autoencoders
  Jake Zhao · Yoon Kim · Kelly Zhang · Alexander Rush · Yann LeCun
- 2017 Poster: Image-to-Markup Generation with Coarse-to-Fine Attention
  Yuntian Deng · Anssi Kanervisto · Jeffrey Ling · Alexander Rush
- 2017 Poster: Estimating individual treatment effect: generalization bounds and algorithms
  Uri Shalit · Fredrik D Johansson · David Sontag
- 2017 Talk: Estimating individual treatment effect: generalization bounds and algorithms
  Uri Shalit · Fredrik D Johansson · David Sontag
- 2017 Talk: Image-to-Markup Generation with Coarse-to-Fine Attention
  Yuntian Deng · Anssi Kanervisto · Jeffrey Ling · Alexander Rush
- 2017 Poster: Variational Boosting: Iteratively Refining Posterior Approximations
  Andrew Miller · Nicholas J Foti · Ryan P. Adams
- 2017 Talk: Variational Boosting: Iteratively Refining Posterior Approximations
  Andrew Miller · Nicholas J Foti · Ryan P. Adams
- 2017 Poster: Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation
  Yacine Jernite · Anna Choromanska · David Sontag
- 2017 Talk: Simultaneous Learning of Trees and Representations for Extreme Classification and Density Estimation
  Yacine Jernite · Anna Choromanska · David Sontag