Variational autoencoders (VAEs) and their variants have been widely used in applications such as dialog generation, image generation, and disentangled representation learning. However, existing VAE models may suffer from KL vanishing in language modeling and low reconstruction quality in disentangled representation learning. To address these issues, we propose ControlVAE, a novel controllable variational autoencoder framework that combines a controller, inspired by automatic control theory, with the basic VAE to improve the performance of the resulting generative models. Specifically, we design a new non-linear PI controller, a variant of proportional-integral-derivative (PID) control, to automatically tune the hyperparameter (weight) added to the VAE objective, using the output KL divergence as feedback during model training. The framework is evaluated on three applications: language modeling, disentangled representation learning, and image generation. The results show that ControlVAE achieves much better reconstruction quality than competing methods at comparable disentanglement performance. For language modeling, it not only averts KL vanishing but also improves the diversity of generated text. Finally, we demonstrate that ControlVAE improves reconstruction quality for image generation compared to the original VAE.
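The feedback loop described in the abstract can be illustrated with a minimal sketch. The abstract does not give the controller's exact form, so everything below is an assumption made for illustration: the class name `PIController`, the sigmoid-shaped proportional term, the gains `kp` and `ki`, the bounds `beta_min` and `beta_max`, and the simple anti-windup rule. It is not the authors' implementation.

```python
import math


class PIController:
    """Hypothetical PI-style controller for the KL weight (beta) of a VAE.

    All names and default values are illustrative assumptions, not taken
    from the paper.
    """

    def __init__(self, target_kl, kp=0.01, ki=0.0001, beta_min=0.0, beta_max=1.0):
        self.target_kl = target_kl  # desired KL divergence (set point)
        self.kp = kp                # proportional gain (assumed value)
        self.ki = ki                # integral gain (assumed value)
        self.beta_min = beta_min
        self.beta_max = beta_max
        self.integral = 0.0         # accumulated integral term

    def update(self, observed_kl):
        """Return the KL weight for the next step given the observed KL."""
        error = self.target_kl - observed_kl
        # Non-linear proportional term: a sigmoid of the error, bounded in
        # (0, kp). The exponent is clamped to avoid floating-point overflow.
        exponent = max(-50.0, min(50.0, error))
        p_term = self.kp / (1.0 + math.exp(exponent))
        # Integral term: grows while the KL is above target (error < 0)
        # and shrinks while it is below target.
        candidate_integral = self.integral - self.ki * error
        beta = p_term + candidate_integral + self.beta_min
        # Simple anti-windup: only accumulate while beta is within bounds.
        if self.beta_min <= beta <= self.beta_max:
            self.integral = candidate_integral
        return min(self.beta_max, max(self.beta_min, beta))
```

In a training loop, one would call `beta = controller.update(kl_value)` once per step with the current batch's KL divergence and then form the loss as the reconstruction term plus `beta` times the KL term. When the observed KL exceeds the target, the weight rises and pushes the KL down; when the KL falls below the target, the weight shrinks toward `beta_min`, which is the mechanism that would counteract KL vanishing.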
Huajie Shao (University of Illinois at Urbana-Champaign)
Shuochao Yao (University of Illinois at Urbana-Champaign)
Dachun Sun (University of Illinois at Urbana-Champaign)
Aston Zhang (AWS AI)
Aston Zhang is a senior scientist at Amazon Web Services AI. His research interests are in deep learning. He received a Ph.D. in computer science from the University of Illinois at Urbana-Champaign. He has served as an editorial board member for Frontiers in Big Data and as a program committee member (reviewer) for ICML, NeurIPS, ICLR, WWW, KDD, SIGIR, and WSDM. His book Dive into Deep Learning (www.d2l.ai) has been used as a textbook at universities worldwide.
Shengzhong Liu (University of Illinois at Urbana-Champaign)
Dongxin Liu (University of Illinois at Urbana-Champaign)
Jun Wang (Alibaba Group)
Tarek Abdelzaher (University of Illinois at Urbana-Champaign)