Poster in Workshop: Structured Probabilistic Inference and Generative Modeling
Policy Gradients for Optimal Parallel Tempering MCMC
Daniel Zhao · Natesh Pillai
Keywords: [ MCMC ] [ Reinforcement Learning ] [ Policy Gradient ] [ ICML ] [ Parallel Tempering ]
Parallel tempering is a meta-algorithm for Markov chain Monte Carlo (MCMC) that runs multiple chains sampling from tempered versions of the target distribution, improving mixing on multimodal distributions that traditional methods struggle to explore. The success of this technique depends critically on the choice of chain temperatures. We introduce an adaptive temperature selection algorithm that adjusts temperatures during sampling using a policy gradient method. Experimental results show that it can outperform traditional geometrically spaced temperature ladders and ladders tuned for uniform acceptance rates in terms of integrated autocorrelation time on some distributions.
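The abstract only sketches the approach, so the following is a minimal illustrative sketch rather than the authors' algorithm: it runs a small parallel-tempering loop on a bimodal 1-D target and adapts the temperature ladder with a REINFORCE-style policy gradient update, using mean swap acceptance as a stand-in reward (the paper evaluates integrated autocorrelation time). All names (log_target, pt_round, ladder, phi) and hyperparameters are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_target(x):
    # Bimodal 1-D target: equal mixture of two well-separated Gaussians.
    return np.logaddexp(-0.5 * (x - 4.0) ** 2, -0.5 * (x + 4.0) ** 2)

def ladder(phi):
    # beta_0 = 1 (cold chain); each later beta shrinks by a ratio in (0, 1).
    ratios = 1.0 / (1.0 + np.exp(-phi))
    return np.concatenate(([1.0], np.cumprod(ratios)))

def pt_round(states, betas, step=0.5):
    # One parallel-tempering round: a random-walk Metropolis step per chain
    # on its tempered target, then neighbouring-chain swap proposals.
    K = len(betas)
    for k in range(K):
        prop = states[k] + step * rng.normal()
        if np.log(rng.uniform()) < betas[k] * (log_target(prop) - log_target(states[k])):
            states[k] = prop
    accepts = np.zeros(K - 1)
    for k in range(K - 1):
        log_alpha = (betas[k] - betas[k + 1]) * (log_target(states[k + 1]) - log_target(states[k]))
        if np.log(rng.uniform()) < log_alpha:
            states[k], states[k + 1] = states[k + 1], states[k]
            accepts[k] = 1.0
    return states, accepts

# Policy-gradient adaptation of the ladder: the "policy" is a Gaussian over
# the unconstrained spacing parameters phi; each iteration samples a ladder,
# runs a few PT rounds, and applies a REINFORCE update with mean swap
# acceptance as the reward (a running baseline reduces gradient variance).
K, sigma, lr = 5, 0.1, 0.01
phi = np.full(K - 1, 0.85)            # initial beta ratios of roughly 0.7
states = rng.normal(size=K)
baseline = 0.0

for it in range(200):
    noise = sigma * rng.normal(size=K - 1)
    betas = ladder(phi + noise)       # sample a ladder from the policy
    rewards = [pt_round(states, betas)[1].mean() for _ in range(20)]
    reward = float(np.mean(rewards))
    advantage = reward - baseline
    baseline = 0.9 * baseline + 0.1 * reward
    # Score-function gradient of the Gaussian policy times the advantage.
    phi += lr * advantage * noise / sigma ** 2

print("adapted inverse temperatures:", np.round(ladder(phi), 3))
```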