Oral
Tighter Variational Bounds are Not Necessarily Better
Tom Rainforth · Adam Kosiorek · Tuan Anh Le · Chris Maddison · Maximilian Igl · Frank Wood · Yee Whye Teh

Wed Jul 11th 03:10 -- 03:20 PM @ A4

We provide theoretical and empirical evidence that using tighter evidence lower bounds (ELBOs) can be detrimental to the process of learning an inference network by reducing the signal-to-noise ratio of the gradient estimator. Our results call into question common implicit assumptions that tighter ELBOs are better variational objectives for simultaneous model learning and inference amortization schemes. Based on our insights, we introduce three new algorithms: the partially importance weighted auto-encoder (PIWAE), the multiply importance weighted auto-encoder (MIWAE), and the combination importance weighted auto-encoder (CIWAE), each of which includes the standard importance weighted auto-encoder (IWAE) as a special case. We show that each can deliver improvements over IWAE, even when performance is measured by the IWAE target itself. Furthermore, our results suggest that PIWAE may be able to deliver simultaneous improvements in the training of both the inference and generative networks.
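
The claim that each new objective includes IWAE as a special case can be illustrated with a small numerical sketch. It is not the authors' implementation, and the abstract itself does not spell out the objectives. The sketch assumes the usual definitions: the IWAE estimate is the log of the mean of K importance weights, MIWAE averages M independent K-sample IWAE estimates, and CIWAE is a convex combination (weight beta) of the standard single-sample ELBO and the IWAE bound. The function names and the toy weights are hypothetical. PIWAE is left out because, as we understand it, it concerns which target each network is trained on rather than a single bound.

```python
import numpy as np


def log_mean_exp(log_w, axis):
    """Numerically stable log(mean(exp(log_w))) along one axis."""
    m = np.max(log_w, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.mean(np.exp(log_w - m), axis=axis))


def iwae_bound(log_w):
    """IWAE estimate from K log importance weights, shape (K,)."""
    return log_mean_exp(log_w, axis=0)


def miwae_bound(log_w):
    """MIWAE estimate (assumed form): mean over M of K-sample IWAE estimates.

    log_w has shape (M, K).
    """
    return np.mean(log_mean_exp(log_w, axis=1))


def ciwae_bound(log_w, beta):
    """CIWAE estimate (assumed form): beta * ELBO + (1 - beta) * IWAE.

    log_w has shape (K,); the ELBO term is the mean of the log weights.
    """
    return beta * np.mean(log_w) + (1.0 - beta) * iwae_bound(log_w)


# Toy log weights, purely illustrative.
log_w = np.log(np.array([1.0, 2.0, 3.0, 4.0]))

iwae = iwae_bound(log_w)
# M = 1 recovers IWAE; K = 1 recovers the standard ELBO estimate.
miwae_m1 = miwae_bound(log_w.reshape(1, 4))
miwae_k1 = miwae_bound(log_w.reshape(4, 1))
# beta = 0 recovers IWAE; beta = 1 recovers the standard ELBO estimate.
ciwae_b0 = ciwae_bound(log_w, beta=0.0)
ciwae_b1 = ciwae_bound(log_w, beta=1.0)
```

Under these assumed definitions, setting M = 1 in MIWAE or beta = 0 in CIWAE reproduces the IWAE estimate, while K = 1 or beta = 1 falls back to the looser standard ELBO estimate.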

Author Information

Tom Rainforth (University of Oxford)
Adam Kosiorek (University of Oxford)

I am a PhD student supervised by Ingmar Posner and Yee Whye Teh. I am interested in machine reasoning, and mostly in efficient inference in deep generative models, especially for time series. I am also excited by attention mechanisms and external memory for neural networks. I received an MSc in Computational Science & Engineering from the Technical University of Munich, where I worked on VAEs with Patrick van der Smagt. In my free time I practice gymnastics and read lots of books.

Tuan Anh Le (University of Oxford)
Chris Maddison (University of Oxford)
Max Igl (University of Oxford)
Frank Wood (University of Oxford)

Dr. Wood is an associate professor in the Department of Engineering Science at the University of Oxford. Before that he was an assistant professor of Statistics at Columbia University and a research scientist at the Columbia Center for Computational Learning Systems. He was formerly a postdoctoral fellow at the Gatsby Computational Neuroscience Unit at University College London. He holds a PhD from Brown University (’07) and a BS from Cornell University (’96), both in computer science. Dr. Wood is the original architect of both the Anglican and Probabilistic-C probabilistic programming systems. He conducts AI-driven research at the boundary of probabilistic programming, Bayesian modeling, and Monte Carlo methods. Dr. Wood holds 6 patents, has authored over 50 papers, received the AISTATS best paper award in 2009, and has been awarded faculty research awards from Xerox, Google and Amazon. Prior to his academic career he was a successful entrepreneur, having run and sold the content-based image retrieval company ToFish! to AOL/Time Warner and served as CEO of Interfolio.

Yee Whye Teh (Oxford and DeepMind)
