Poster in Workshop: Structured Probabilistic Inference and Generative Modeling

Policy Gradients for Optimal Parallel Tempering MCMC

Daniel Zhao · Natesh Pillai

Keywords: [ MCMC ] [ Reinforcement Learning ] [ Policy Gradient ] [ ICML ] [ Parallel Tempering ]


Abstract:

Parallel tempering is a meta-algorithm for Markov chain Monte Carlo (MCMC) methods that runs multiple chains sampling from tempered versions of the target distribution, improving mixing on multimodal distributions that are difficult for traditional methods to explore. The success of this technique depends critically on the choice of chain temperatures. We introduce an adaptive temperature selection algorithm that adjusts temperatures during sampling via a policy gradient method. Experimental results show that it can outperform traditional geometrically spaced temperature ladders and ladders tuned for uniform acceptance rates, as measured by integrated autocorrelation time, on some distributions.
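To make the idea concrete, below is a minimal illustrative sketch, not the authors' algorithm: parallel tempering with random-walk Metropolis moves on a 1-D bimodal target, where the log spacings of the inverse-temperature ladder are treated as the mean of a Gaussian policy and updated with a REINFORCE-style gradient. The reward (here, simply whether a replica swap was accepted), the ladder parameterization, and all variable names are assumptions made for illustration; the paper's actual reward and policy design may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def neg_log_target(x):
    # Energy E(x) = -log pi(x) for a bimodal mixture of two Gaussians.
    return -np.logaddexp(-0.5 * (x - 4.0) ** 2, -0.5 * (x + 4.0) ** 2)

def betas_from_params(theta):
    # Inverse temperatures 1 = beta_0 > beta_1 > ... built from positive
    # log-spacings, so the ladder ordering is preserved while learning.
    return np.exp(-np.cumsum(np.concatenate(([0.0], np.exp(theta)))))

n_chains, n_steps, lr, policy_sd = 4, 5000, 0.01, 0.05
theta = np.zeros(n_chains - 1)   # policy mean: log spacings of the ladder
x = rng.normal(size=n_chains)    # one state per tempered chain

for step in range(n_steps):
    # Sample a ladder from a Gaussian policy around theta (the "action").
    eps = rng.normal(scale=policy_sd, size=theta.shape)
    betas = betas_from_params(theta + eps)

    # Within-chain random-walk Metropolis at each inverse temperature.
    for i, beta in enumerate(betas):
        prop = x[i] + rng.normal(scale=1.0)
        if np.log(rng.uniform()) < beta * (neg_log_target(x[i]) - neg_log_target(prop)):
            x[i] = prop

    # Replica swap between a random adjacent pair of chains.
    i = rng.integers(n_chains - 1)
    log_alpha = (betas[i] - betas[i + 1]) * (neg_log_target(x[i]) - neg_log_target(x[i + 1]))
    accepted = np.log(rng.uniform()) < log_alpha
    if accepted:
        x[i], x[i + 1] = x[i + 1], x[i]

    # REINFORCE update: reward = swap acceptance (a stand-in objective);
    # the score function of the Gaussian policy is eps / policy_sd**2.
    theta += lr * float(accepted) * eps / policy_sd ** 2

print("learned inverse temperatures:", betas_from_params(theta))
```

In practice, a reward tied directly to mixing (such as a running estimate of integrated autocorrelation time, as in the abstract) would replace the swap-acceptance surrogate used in this sketch.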
