Skip to yearly menu bar Skip to main content


Long Horizon Temperature Scaling

Andy Shih · Dorsa Sadigh · Stefano Ermon

Exhibit Hall 1 #535
[ ]
[ PDF [ Poster

Abstract: Temperature scaling is a popular technique for tuning the sharpness of a model distribution. It is used extensively for sampling likely generations and calibrating model uncertainty, and even features as a controllable parameter to many large language models in deployment. However, autoregressive models rely on myopic temperature scaling that greedily optimizes the next token. To address this, we propose *Long Horizon Temperature Scaling* (LHTS), a novel approach for sampling from temperature-scaled *joint* distributions. LHTS is compatible with all likelihood-based models, and optimizes for the long-horizon likelihood of samples. We derive a temperature-dependent LHTS objective, and show that fine-tuning a model on a range of temperatures produces a single model capable of generation with a controllable long-horizon temperature parameter. We experiment with LHTS on image diffusion models and character/language autoregressive models, demonstrating its advantages over myopic temperature scaling in likelihood and sample quality, and showing improvements in accuracy of a multiple choice analogy by $10$%.

Chat is not available.