Algorithms for Optimal Adaptation of Diffusion Models to Reward Functions
Krishnamurthy Dvijotham · Shayegan Omidshafiei · Kimin Lee · Katie Collins · Deepak Ramachandran · Adrian Weller · Mohammad Ghavamzadeh · Milad Nasresfahani · Ying Fan · Jeremiah Liu
Event URL: https://openreview.net/forum?id=WRpRPsU0VT

We develop algorithms for adapting pretrained diffusion models to optimize reward functions while retaining fidelity to the pretrained model. We propose a general framework for this adaptation that trades off fidelity to the pretrained diffusion model against achieving high reward. Our algorithms take advantage of the continuous nature of diffusion processes to pose reward-based learning as either a trajectory optimization problem or a continuous-state reinforcement learning problem. We demonstrate the efficacy of our approach across several application domains, including the generation of time series of household power consumption and of images satisfying specific constraints, such as the absence of memorized images or corruptions.
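
To make the fidelity-versus-reward trade-off concrete, here is a minimal toy sketch. It is not the authors' algorithm, and the abstract does not state their exact objective. It assumes a common formulation in which fidelity is measured by a KL divergence to the pretrained distribution and weighted by a coefficient beta; the names tilt, objective, prior, rewards, and beta are hypothetical. For a discrete distribution, the maximizer of E_p[r] - beta * KL(p || prior) is known in closed form: p is proportional to prior * exp(r / beta).

```python
import math

def tilt(prior, rewards, beta):
    """Maximizer of E_p[r] - beta * KL(p || prior) over discrete distributions.

    Assumption: KL-regularized objective, chosen here for illustration only.
    """
    weights = [p0 * math.exp(r / beta) for p0, r in zip(prior, rewards)]
    total = sum(weights)
    return [w / total for w in weights]

def objective(p, prior, rewards, beta):
    """Expected reward minus beta times the KL divergence from the prior."""
    expected_reward = sum(pi * r for pi, r in zip(p, rewards))
    kl = sum(pi * math.log(pi / p0) for pi, p0 in zip(p, prior) if pi > 0)
    return expected_reward - beta * kl

# Toy stand-in for a pretrained model: a distribution over three outcomes.
prior = [0.5, 0.3, 0.2]
rewards = [0.0, 1.0, 2.0]

strict = tilt(prior, rewards, beta=100.0)  # large beta: stays close to the prior
loose = tilt(prior, rewards, beta=0.1)     # small beta: concentrates on high reward
balanced = tilt(prior, rewards, beta=1.0)  # intermediate trade-off
```

A large beta keeps the adapted distribution near the pretrained one, while a small beta lets it concentrate on the highest-reward outcome. A diffusion model defines a distribution over continuous trajectories rather than three outcomes, so this closed form only illustrates the trade-off, not how it would be optimized in practice.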

Author Information

Krishnamurthy Dvijotham (Google DeepMind)
Shayegan Omidshafiei (DeepMind)
Kimin Lee (Google)
Katie Collins (University of Cambridge)
Deepak Ramachandran (Google)
Adrian Weller (University of Cambridge, Alan Turing Institute)
Adrian Weller

Adrian Weller is Programme Director for AI at The Alan Turing Institute, the UK national institute for data science and AI, and is a Turing AI Fellow leading work on trustworthy Machine Learning (ML). He is a Principal Research Fellow in ML at the University of Cambridge, and at the Leverhulme Centre for the Future of Intelligence where he is Programme Director for Trust and Society. His interests span AI, its commercial applications and helping to ensure beneficial outcomes for society. Previously, Adrian held senior roles in finance. He received a PhD in computer science from Columbia University, and an undergraduate degree in mathematics from Trinity College, Cambridge.

Mohammad Ghavamzadeh (Google Research)
Milad Nasresfahani (Google)
Ying Fan (UW-Madison)
Jeremiah Liu (Google Research)

More from the Same Authors