Skip to yearly menu bar Skip to main content


Poster
in
Workshop: Structured Probabilistic Inference and Generative Modeling

Fit Like You Sample: Sample-Efficient Generalized Score Matching from Fast Mixing Markov Chains

Yilong Qin · Andrej Risteski

Keywords: [ annealing ] [ Theory ] [ score matching ] [ sample complexity ]


Abstract: Score matching is an approach to learning probability distributions parametrized up to a constant of proportionality (e.g. EBMs). The idea is to fit the score of the distribution (i.e. $\nabla_x \log p(x)$), rather than the likelihood, thus avoiding the need to evaluate the constant of proportionality. While there's a clear algorithmic benefit, the statistical "cost" can be steep: recent work by Koehler et al '23 showed that for distributions that have poor isoperimetric properties (a large Poincare or log-Sobolev constant), score matching is substantially statistically less efficient than maximum likelihood. However, many natural realistic distributions, e.g. multimodal distributions as simple as a mixture of two Gaussians---even in one dimension---have a poor Poincare constant.In this paper, we show a close connection between the mixing time of an arbitrary Markov process with generator $\mathcal{L}$ and a generalized score matching loss that tries to fit $\frac{\mathcal{O}p}{p}$. We instantiate this framework with several examples. In the special case of $\mathcal{O} = \nabla_x$, and $\mathcal{L}$ being the generator of Langevin diffusion, this generalizes and recovers the results from Koehler et al '23. If $\mathcal{L}$ corresponds to a Markov process corresponding to a continuous version of simulated tempering, we show the corresponding generalized score matching loss is a Gaussian-convolution annealed score matching loss, akin to the one proposed in Song-Ermon '19. Moreover, we show that if the distribution being learned is a mixture of $K$ Gaussians in $d$ dimensions, the sample complexity of annealed score matching is polynomial in $d$ and $K$ --- obviating the Poincar'e constant-based lower bounds of the basic score matching loss shown in Koehler et al. This is the first result characterizing the benefits of annealing for score matching---a crucial component in more sophisticated score-based approaches like Song-Ermon '19.

Chat is not available.