Session
Deep Generative Model 2
Moderator: Dustin Tran
Spectral Smoothing Unveils Phase Transitions in Hierarchical Variational Autoencoders
Adeel Pervez · Efstratios Gavves
Variational autoencoders with deep hierarchies of stochastic layers are known to suffer from posterior collapse, where the top layers fall back to the prior and become independent of the input. We suggest that the hierarchical VAE objective explicitly includes the variance of the function parameterizing the mean and variance of the latent Gaussian distribution, and that this function is itself often high-variance. Building on this, we generalize VAE neural networks by incorporating a smoothing parameter, motivated by Gaussian analysis, that reduces higher-frequency components and consequently the variance of the parameterizing functions, and we show that this can help resolve posterior collapse. We further show that under such smoothing the VAE loss exhibits a phase transition, in which the top-layer KL divergence sharply drops to zero at a critical value of the smoothing parameter that is similar for the same model across datasets. We validate the phenomenon across model configurations and datasets.
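A minimal sketch of the smoothing idea, under our own assumptions (not the authors' code): a layer's output is Gaussian-smoothed by Monte Carlo averaging over perturbed inputs, with a `sigma` parameter controlling how strongly high-frequency components, and hence the variance of the parameterizing function, are damped.

```python
import torch
import torch.nn as nn

class SmoothedLinear(nn.Module):
    """Illustrative layer whose output is averaged over Gaussian perturbations
    of its input; `sigma` is the smoothing parameter (sigma = 0 recovers the
    ordinary layer). A sketch of Gaussian smoothing, not the paper's exact layer."""
    def __init__(self, d_in, d_out, sigma=0.1, n_samples=8):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        self.sigma = sigma
        self.n_samples = n_samples

    def forward(self, x):
        if self.sigma == 0:
            return torch.tanh(self.linear(x))
        # Monte Carlo estimate of the Gaussian-smoothed (heat-kernel) output,
        # which suppresses high-frequency components of the layer's function.
        noisy = x.unsqueeze(0) + self.sigma * torch.randn(self.n_samples, *x.shape)
        return torch.tanh(self.linear(noisy)).mean(dim=0)
```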
Riemannian Convex Potential Maps
Samuel Cohen · Brandon Amos · Yaron Lipman
Modeling distributions on Riemannian manifolds is a crucial component in understanding non-Euclidean data that arises, e.g., in physics and geology. The budding approaches in this space are limited by representational and computational tradeoffs. We propose and study a class of flows that uses convex potentials from Riemannian optimal transport. These flows are universal and can model distributions on any compact Riemannian manifold without requiring domain knowledge of the manifold to be integrated into the architecture. We demonstrate that these flows can model standard distributions on spheres and tori, as well as synthetic and geological data.
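To make the building block concrete, here is a small sketch on the unit sphere, assuming the usual convex-potential construction: a point is transported by the exponential map applied to the (tangent-projected) gradient of a scalar potential. The potential, finite-difference gradient, and example are illustrative choices of ours, not the paper's implementation.

```python
import numpy as np

def sphere_exp(x, v):
    """Exponential map on the unit sphere: move from x along tangent vector v."""
    norm_v = np.linalg.norm(v, axis=-1, keepdims=True)
    norm_v = np.where(norm_v == 0, 1e-12, norm_v)  # guard zero-length steps
    return np.cos(norm_v) * x + np.sin(norm_v) * v / norm_v

def push_forward(x, potential, eps=1e-4):
    """Map x through exp_x(grad potential(x)), the basic transport step of a
    convex-potential flow (finite-difference gradient for illustration only)."""
    g = np.zeros_like(x)
    for i in range(x.shape[-1]):
        dx = np.zeros_like(x)
        dx[..., i] = eps
        g[..., i] = (potential(x + dx) - potential(x - dx)) / (2 * eps)
    # Project the Euclidean gradient onto the tangent space at x.
    g = g - (g * x).sum(-1, keepdims=True) * x
    return sphere_exp(x, g)

# Example: a toy potential that pushes mass toward the north pole.
x = np.random.randn(5, 3)
x /= np.linalg.norm(x, axis=-1, keepdims=True)
y = push_forward(x, lambda p: 0.5 * p[..., 2])
```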
Autoencoding Under Normalization Constraints
Sangwoong Yoon · Yung-Kyun Noh · Frank Chongwoo Park
Likelihood is a standard estimate for outlier detection. The specific role of the normalization constraint is to ensure that the out-of-distribution (OOD) regime has small likelihood when the model is trained by maximum likelihood. Because autoencoders do not have such a normalization process, they often fail to recognize outliers even when they are obviously OOD. We propose the Normalized Autoencoder (NAE), a normalized probabilistic model constructed from an autoencoder. The probability density of NAE is defined via the reconstruction error of the autoencoder, unlike in a conventional energy-based model. In our model, normalization is enforced by suppressing the reconstruction of negative samples, which significantly improves outlier detection performance. Our experimental results confirm the efficacy of NAE, both in detecting outliers and in generating in-distribution samples.
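A minimal sketch of the core construction, under our own assumptions: the energy of a sample is its reconstruction error, the density is taken proportional to exp(-energy), and training contrasts data against negative samples so that reconstruction is suppressed off-distribution. The loss shape and the source of `x_negative` (e.g., an MCMC sampler on the energy) are illustrative, not the paper's exact objective.

```python
import torch

def reconstruction_energy(autoencoder, x):
    """Energy of a sample: its autoencoder reconstruction error.
    The model density is taken proportional to exp(-energy)."""
    return ((x - autoencoder(x)) ** 2).flatten(1).sum(dim=1)

def nae_style_loss(autoencoder, x_data, x_negative):
    """Contrastive, energy-based training signal (a sketch, not the paper's
    exact objective): push energy down on data and up on negative samples,
    which suppresses reconstruction outside the data distribution.
    `x_negative` is assumed to come from some sampler over the energy."""
    return (reconstruction_energy(autoencoder, x_data).mean()
            - reconstruction_energy(autoencoder, x_negative).mean())
```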
PixelTransformer: Sample Conditioned Signal Generation
Shubham Tulsiani · Abhinav Gupta
We propose a generative model that can infer a distribution for the underlying spatial signal conditioned on sparse samples, e.g., plausible images given a few observed pixels. In contrast to sequential autoregressive generative models, our model allows conditioning on arbitrary samples and can answer distributional queries for any location. We empirically validate our approach across three image datasets and show that it learns to generate diverse and meaningful samples, with the distribution variance shrinking as more pixels are observed. We also show that our approach is applicable beyond images and can generate other types of spatial outputs, e.g., polynomials, 3D shapes, and videos.
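The following toy module sketches the interface this implies, under our own assumptions: a set of observed (position, value) pairs is encoded, arbitrary query positions attend to that set, and the model returns a distribution (here a diagonal Gaussian) per query. All names, sizes, and the single-attention architecture are illustrative, not the paper's model.

```python
import torch
import torch.nn as nn

class SampleConditionedField(nn.Module):
    """Toy sample-conditioned signal model: condition on arbitrary
    (position, value) samples and predict a distribution at arbitrary
    query positions. Illustrative only."""
    def __init__(self, pos_dim=2, val_dim=3, d_model=64):
        super().__init__()
        self.embed_obs = nn.Linear(pos_dim + val_dim, d_model)
        self.embed_query = nn.Linear(pos_dim, d_model)
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.head = nn.Linear(d_model, 2 * val_dim)  # per-query mean and log-variance

    def forward(self, obs_pos, obs_val, query_pos):
        ctx = self.embed_obs(torch.cat([obs_pos, obs_val], dim=-1))
        q = self.embed_query(query_pos)
        out, _ = self.attn(q, ctx, ctx)  # queries attend to the observed samples
        mean, log_var = self.head(out).chunk(2, dim=-1)
        return mean, log_var

# e.g. 10 observed pixels, 50 queried locations, batch of 1
model = SampleConditionedField()
mean, log_var = model(torch.rand(1, 10, 2), torch.rand(1, 10, 3), torch.rand(1, 50, 2))
```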
Generative Adversarial Networks for Markovian Temporal Dynamics: Stochastic Continuous Data Generation
Sung Woo Park · Dong Wook Shu · Junseok Kwon
In this paper, we present a novel generative adversarial network (GAN) that can describe Markovian temporal dynamics. To generate stochastic sequential data, we introduce a novel stochastic differential equation-based conditional generator and spatio-temporally constrained discriminator networks. To stabilize the learning dynamics of the min-max GAN objective, we propose well-posed constraint terms for both networks. We also propose a novel conditional Markov Wasserstein distance that induces a pathwise Wasserstein distance. The experimental results demonstrate that our method outperforms state-of-the-art methods on several different types of data.
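As a rough illustration of how an SDE-based generator yields Markovian sequences, here is a sketch using Euler-Maruyama rollouts of learned drift and diffusion networks; each state depends only on the previous one. The architecture and step rule are our own simplification, not the paper's generator.

```python
import torch
import torch.nn as nn

class SDEGenerator(nn.Module):
    """Minimal SDE-style sequence generator (illustrative): drift and diffusion
    networks are unrolled with Euler-Maruyama steps, giving Markovian dynamics."""
    def __init__(self, dim, hidden=64, dt=0.05):
        super().__init__()
        self.drift = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))
        self.diffusion = nn.Sequential(nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))
        self.dt = dt

    def forward(self, x0, n_steps):
        x, path = x0, [x0]
        for _ in range(n_steps):
            noise = torch.randn_like(x)
            # Euler-Maruyama: x_{t+dt} = x_t + f(x_t) dt + g(x_t) sqrt(dt) * noise
            x = x + self.drift(x) * self.dt + self.diffusion(x) * (self.dt ** 0.5) * noise
            path.append(x)
        return torch.stack(path, dim=1)  # (batch, time, dim)
```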
Autoencoder Image Interpolation by Shaping the Latent Space
Alon Oring · Zohar Yakhini · Yacov Hel-Or
One of the fascinating properties of deep learning is the ability of a network to reveal the underlying factors characterizing elements in datasets of different types. Autoencoders represent an effective approach for computing these factors, and they have been studied in the context of enabling interpolation between data points by decoding convex combinations of latent vectors. However, this interpolation often leads to artifacts or produces unrealistic results during reconstruction. We argue that these incongruities are due to the structure of the latent space and to the fact that such naively interpolated latent vectors deviate from the data manifold. In this paper, we propose a regularization technique that shapes the latent representation to follow a manifold consistent with the training images and forces that manifold to be smooth and locally convex. This regularization not only enables faithful interpolation between data points, as we show herein, but can also be used as a general regularization technique to avoid overfitting or to produce new samples for data augmentation.
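One simple way to penalize off-manifold interpolations, sketched under our own assumptions (this is a generic cycle-consistency variant, not the paper's exact regularizer): decode a convex combination of two latent codes, re-encode the result, and penalize its deviation from the interpolated code.

```python
import torch

def interpolation_cycle_penalty(encoder, decoder, x1, x2):
    """Illustrative latent-interpolation regularizer: decode a convex combination
    of two latent codes, re-encode it, and penalize the deviation from the
    interpolated code, encouraging interpolations to stay on the data manifold."""
    z1, z2 = encoder(x1), encoder(x2)
    # One random mixing coefficient per example, broadcast over latent dims.
    alpha = torch.rand(z1.shape[0], *([1] * (z1.dim() - 1)), device=z1.device)
    z_mix = alpha * z1 + (1 - alpha) * z2
    return ((encoder(decoder(z_mix)) - z_mix) ** 2).mean()
```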
Improved Denoising Diffusion Probabilistic Models
Alexander Nichol · Prafulla Dhariwal
Denoising diffusion probabilistic models (DDPMs) are a class of generative models which have recently been shown to produce excellent samples. We show that with a few simple modifications, DDPMs can also achieve competitive log-likelihoods while maintaining high sample quality. Additionally, we find that learning variances of the reverse diffusion process allows sampling with an order of magnitude fewer forward passes with a negligible difference in sample quality, which is important for the practical deployment of these models. We additionally use precision and recall to compare how well DDPMs and GANs cover the target distribution. Finally, we show that the sample quality and likelihood of these models scale smoothly with model capacity and training compute, making them easily scalable. We release our code and pre-trained models at https://github.com/openai/improved-diffusion.
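As a brief sketch of the variance-learning piece (our paraphrase of the paper's parameterization, with illustrative names): the network predicts a vector v that interpolates, in log space, between the two analytic extremes beta_t and the posterior variance beta_tilde_t.

```python
import torch

def learned_reverse_variance(v, beta_t, beta_tilde_t):
    """Reverse-process variance parameterized as an interpolation in log space:
    sigma_t^2 = exp(v * log(beta_t) + (1 - v) * log(beta_tilde_t)),
    where beta_tilde_t is the posterior variance of the forward process.
    v is the network output; names here are illustrative."""
    return torch.exp(v * torch.log(beta_t) + (1 - v) * torch.log(beta_tilde_t))
```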