Track: Generative Models 1

Wed 11 July 7:00 - 7:20 PDT

Which Training Methods for GANs do actually Converge?

Lars Mescheder · Andreas Geiger · Sebastian Nowozin

Recent work has shown local convergence of GAN training for absolutely continuous data and generator distributions. In this paper, we show that the requirement of absolute continuity is necessary: we describe a simple yet prototypical counterexample showing that in the more realistic case of distributions that are not absolutely continuous, unregularized GAN training is not always convergent. Furthermore, we discuss regularization strategies that were recently proposed to stabilize GAN training. Our analysis shows that GAN training with instance noise or zero-centered gradient penalties converges. On the other hand, we show that Wasserstein-GANs and WGAN-GP with a finite number of discriminator updates per generator update do not always converge to the equilibrium point. We discuss these results, leading us to a new explanation for the stability problems of GAN training. Based on our analysis, we extend our convergence results to more general GANs and prove local convergence for simplified gradient penalties even if the generator and data distributions lie on lower dimensional manifolds. We find these penalties to work well in practice and use them to learn high-resolution generative image models for a variety of datasets with little hyperparameter tuning.

Wed 11 July 7:20 - 7:40 PDT

Chi-square Generative Adversarial Network

Chenyang Tao · Liqun Chen · Ricardo Henao · Jianfeng Feng · Lawrence Carin

To assess the difference between real and synthetic data, Generative Adversarial Networks (GANs) are trained using a distribution discrepancy measure. Three widely employed measures are information-theoretic divergences, integral probability metrics, and Hilbert space discrepancy metrics. We elucidate the theoretical connections between these three popular GAN training criteria and propose a novel procedure, called $\chi^2$ (Chi-square) GAN, that is conceptually simple, stable at training and resistant to mode collapse. Our procedure naturally generalizes to address the problem of simultaneous matching of multiple distributions. Further, we propose a resampling strategy that significantly improves sample quality, by repurposing the trained critic function via an importance weighting mechanism. Experiments show that the proposed procedure improves stability and convergence, and yields state-of-art results on a wide range of generative modeling tasks.

Wed 11 July 7:40 - 7:50 PDT

Learning Implicit Generative Models with the Method of Learned Moments

Suman Ravuri · Shakir Mohamed · Mihaela Rosca · Oriol Vinyals

We propose a method of moments (MoM) algorithm for training large-scale implicit generative models. Moment estimation in this setting encounters two problems: it is often difficult to define the millions of moments needed to learn the model parameters, and it is hard to determine which properties are useful when specifying moments. To address the first issue, we introduce a moment network, and define the moments as the network's hidden units and the gradient of the network's output with respect to its parameters. To tackle the second problem, we use asymptotic theory to highlight desiderata for moments -- namely they should minimize the asymptotic variance of estimated model parameters -- and introduce an objective to learn better moments. The sequence of objectives created by this Method of Learned Moments (MoLM) can train high-quality neural image samplers. On CIFAR-10, we demonstrate that MoLM-trained generators achieve significantly higher Inception Scores and lower Frechet Inception Distances than those trained with gradient penalty-regularized and spectrally-normalized adversarial objectives. These generators also achieve nearly perfect Multi-Scale Structural Similarity Scores on CelebA, and can create high-quality samples of 128x128 images.

Wed 11 July 7:50 - 8:00 PDT

A Classification-Based Study of Covariate Shift in GAN Distributions

Shibani Santurkar · Ludwig Schmidt · Aleksander Madry

A basic, and still largely unanswered, question in the context of Generative Adversarial Networks (GANs) is whether they are truly able to capture all the fundamental characteristics of the distributions they are trained on. In particular, evaluating the diversity of GAN distributions is challenging and existing methods provide only a partial understanding of this issue. In this paper, we develop quantitative and scalable tools for assessing the diversity of GAN distributions. Specifically, we take a classification-based perspective and view loss of diversity as a form of covariate shift introduced by GANs. We examine two specific forms of such shift: mode collapse and boundary distortion. In contrast to prior work, our methods need only minimal human supervision and can be readily applied to state-of-the-art GANs on large, canonical datasets. Examining popular GANs using our tools indicates that these GANs have significant problems in reproducing the more distributional properties of their training dataset.

Main Navigation

Session

Generative Models 1

Which Training Methods for GANs do actually Converge?

Chi-square Generative Adversarial Network

Learning Implicit Generative Models with the Method of Learned Moments

A Classification-Based Study of Covariate Shift in GAN Distributions