

Poster

Principled Gradient-Based MCMC for Conditional Sampling of Text

Li Du · Afra Amini · Lucas Torroba Hennigen · Xinyan Velocity Yu · Holden Lee · Jason Eisner · Ryan Cotterell

Hall C 4-9 #2617
Tue 23 Jul 2:30 a.m. PDT — 4 a.m. PDT

Abstract:

We consider the problem of sampling text from an energy-based model. This arises, for example, when sampling text from a neural language model subject to soft constraints. Although the target distribution is discrete, the internal computations of the energy function (given by the language model) are differentiable, so one would like to exploit gradient information within a method such as MCMC. Alas, all previous attempts to generalize gradient-based MCMC to text sampling fail to sample correctly from the target distribution. We propose a solution, along with variants, and study its theoretical properties. Through experiments on various forms of text generation, we demonstrate that our unbiased samplers are able to generate more fluent text while better adhering to the control objectives. The same methods could be used to sample from discrete energy-based models unrelated to text.
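As a rough illustration of the idea the abstract describes (a gradient-informed proposal over a discrete target, corrected so that sampling stays unbiased), the sketch below runs Metropolis-Hastings over token sequences using a first-order Taylor estimate of energy changes, in the spirit of Gibbs-with-Gradients (Grathwohl et al., 2021). This is not the authors' sampler; the toy energy function, embedding dimensions, vocabulary size V, and sequence length L are all illustrative assumptions.

import torch
import torch.nn.functional as F

V, L = 50, 8                         # assumed vocabulary size and sequence length
emb = torch.randn(V, 16)             # toy token embeddings (stand-in for an LM)
W = torch.randn(16, 1)               # toy energy parameters

def energy(onehot):
    # Toy differentiable energy over a one-hot (L, V) sequence encoding.
    return (onehot @ emb @ W).sum()

def proposal_logits(onehot):
    # First-order estimate of the energy change from editing each token.
    onehot = onehot.detach().requires_grad_(True)
    (g,) = torch.autograd.grad(energy(onehot), onehot)
    # d[i, v] ~= E(x with position i set to v) - E(x), by a Taylor expansion.
    d = g - (g * onehot).sum(-1, keepdim=True)
    return (-d / 2.0).flatten()      # lower predicted energy => more likely

def mh_step(x):
    # One Metropolis-Hastings step; the accept/reject correction is what
    # keeps the chain unbiased for the discrete target exp(-E(x)).
    onehot = F.one_hot(x, V).float()
    logits = proposal_logits(onehot)
    idx = torch.distributions.Categorical(logits=logits).sample()
    i, v = idx // V, idx % V         # proposed position and new token
    x_new = x.clone()
    x_new[i] = v
    logits_rev = proposal_logits(F.one_hot(x_new, V).float())
    log_q_fwd = logits.log_softmax(0)[idx]
    log_q_rev = logits_rev.log_softmax(0)[i * V + x[i]]
    log_alpha = (energy(onehot) - energy(F.one_hot(x_new, V).float())
                 + log_q_rev - log_q_fwd)
    return x_new if torch.rand(()).log() < log_alpha else x

x = torch.randint(V, (L,))           # random initial sequence
for _ in range(100):
    x = mh_step(x)

Because the acceptance ratio uses the exact energies while the gradient only shapes the proposal, the stationary distribution is the true target even when the Taylor estimate is inaccurate, which is the property a biased gradient sampler without the correction lacks.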
