Talk

Improved Variational Autoencoders for Text Modeling using Dilated Convolutions

Zichao Yang ⋅ Zhiting Hu ⋅ Ruslan Salakhutdinov ⋅ Taylor Berg-Kirkpatrick

2017 Talk

[ PDF] [

Summary/Notes] [ Video]

Abstract

Recent work on generative text modeling has found that variational autoencoders (VAE) with LSTM decoders perform worse than simpler LSTM language models (Bowman et al., 2015). This negative result is so far poorly understood, but has been attributed to the propensity of LSTM decoders to ignore conditioning informa- tion from the encoder. In this paper, we ex- periment with a new type of decoder for VAE: a dilated CNN. By changing the decoder’s di- lation architecture, we control the size of con- text from previously generated words. In ex- periments, we find that there is a trade-off be- tween contextual capacity of the decoder and ef- fective use of encoding information. We show that when carefully managed, VAEs can outper- form LSTM language models. We demonstrate perplexity gains on two datasets, representing the first positive language modeling result with VAE. Further, we conduct an in-depth investigation of the use of VAE (with our new decoding archi- tecture) for semi-supervised and unsupervised la- beling tasks, demonstrating gains over several strong baselines.

Chat is not available.