

Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models

Rares-Darius Buhai · Yoni Halpern · Yoon Kim · Andrej Risteski · David Sontag


Keywords: [ Approximate Inference ] [ Graphical Models ] [ Unsupervised Learning ] [ Probabilistic Inference - Models and Probabilistic Programming ]


One of the most surprising and exciting discoveries in supervised learning was that overparameterization (i.e., training a very large model) improves the optimization landscape of a problem with minimal effect on statistical performance (i.e., generalization). In contrast, unsupervised settings have been under-explored, even though overparameterization was observed to be helpful there as early as Dasgupta & Schulman (2007). We perform an empirical study of different aspects of overparameterization in unsupervised learning of latent variable models via synthetic and semi-synthetic experiments. We discuss benefits to different metrics of success (recovery of the ground-truth model's parameters, held-out log-likelihood), sensitivity to variations of the training algorithm, and behavior as the amount of overparameterization increases. We find that across a variety of models (noisy-OR networks, sparse coding, probabilistic context-free grammars) and training algorithms (variational inference, alternating minimization, expectation-maximization), overparameterization can significantly increase the number of ground-truth latent variables recovered.
