Timezone: »
We develop algorithms for adapting pretrained diffusion models to optimize reward functions while retaining fidelity to the pretrained model. We propose a general framework for this adaptation that trades off fidelity to a pretrained diffusion model and achieving high reward. Our algorithms take advantage of the continuous nature of diffusion processes to pose reward-based learning either as a trajectory optimization or continuous state reinforcement learning problem. We demonstrate the efficacy of our approach across several application domains, including the generation of time series of household power consumption and images satisfying specific constraints like the absence of memorized images or corruptions.
Author Information
Krishnamurthy Dvijotham (Google DeepMind)
Shayegan Omidshafiei (DeepMind)
Kimin Lee (Google)
Katie Collins (University of Cambridge)
Deepak Ramachandran (Google)
Adrian Weller (University of Cambridge, Alan Turing Institute)

Adrian Weller is Programme Director for AI at The Alan Turing Institute, the UK national institute for data science and AI, and is a Turing AI Fellow leading work on trustworthy Machine Learning (ML). He is a Principal Research Fellow in ML at the University of Cambridge, and at the Leverhulme Centre for the Future of Intelligence where he is Programme Director for Trust and Society. His interests span AI, its commercial applications and helping to ensure beneficial outcomes for society. Previously, Adrian held senior roles in finance. He received a PhD in computer science from Columbia University, and an undergraduate degree in mathematics from Trinity College, Cambridge.
Mohammad Ghavamzadeh (Google Research)
Milad Nasresfahani (Google)
Ying Fan (UW-Madison)
Jeremiah Liu (Google Research)
More from the Same Authors
-
2021 : Diverse and Amortised Counterfactual Explanations for Uncertainty Estimates »
· Dan Ley · Umang Bhatt · Adrian Weller -
2021 : Diverse and Amortised Counterfactual Explanations for Uncertainty Estimates »
Dan Ley · Umang Bhatt · Adrian Weller -
2021 : On the Fairness of Causal Algorithmic Recourse »
Julius von Kügelgen · Amir-Hossein Karimi · Umang Bhatt · Isabel Valera · Adrian Weller · Bernhard Schölkopf · Amir-Hossein Karimi -
2021 : Towards Principled Disentanglement for Domain Generalization »
Hanlin Zhang · Yi-Fan Zhang · Weiyang Liu · Adrian Weller · Bernhard Schölkopf · Eric Xing -
2021 : Diverse and Amortised Counterfactual Explanations for Uncertainty Estimates »
Dan Ley · Umang Bhatt · Adrian Weller -
2021 : CrossWalk: Fairness-enhanced Node Representation Learning »
Ahmad Khajehnejad · Moein Khajehnejad · Krishna Gummadi · Adrian Weller · Baharan Mirzasoleiman -
2022 : Non-stationary Bandits and Meta-Learning with a Small Set of Optimal Arms »
MohammadJavad Azizi · Thang Duong · Yasin Abbasi-Yadkori · Claire Vernade · Andras Gyorgy · Mohammad Ghavamzadeh -
2022 : Plex: Towards Reliability using Pretrained Large Model Extensions »
Dustin Tran · Andreas Kirsch · Balaji Lakshminarayanan · Huiyi Hu · Du Phan · D. Sculley · Jasper Snoek · Jeremiah Liu · Jie Ren · Joost van Amersfoort · Kehang Han · E. Kelly Buchanan · Kevin Murphy · Mark Collier · Mike Dusenberry · Neil Band · Nithum Thain · Rodolphe Jenatton · Tim G. J Rudner · Yarin Gal · Zachary Nado · Zelda Mariet · Zi Wang · Zoubin Ghahramani -
2022 : Plex: Towards Reliability using Pretrained Large Model Extensions »
Dustin Tran · Andreas Kirsch · Balaji Lakshminarayanan · Huiyi Hu · Du Phan · D. Sculley · Jasper Snoek · Jeremiah Liu · JIE REN · Joost van Amersfoort · Kehang Han · Estefany Kelly Buchanan · Kevin Murphy · Mark Collier · Michael Dusenberry · Neil Band · Nithum Thain · Rodolphe Jenatton · Tim G. J Rudner · Yarin Gal · Zachary Nado · Zelda Mariet · Zi Wang · Zoubin Ghahramani -
2022 : Perspectives on Incorporating Expert Feedback into Model Updates »
Valerie Chen · Umang Bhatt · Hoda Heidari · Adrian Weller · Ameet Talwalkar -
2023 : Guide Your Agent with Adaptive Multimodal Rewards »
Changyeon Kim · Younggyo Seo · Hao Liu · Lisa Lee · Jinwoo Shin · Honglak Lee · Kimin Lee -
2023 : Privacy Auditing with One (1) Training Run »
Thomas Steinke · Milad Nasresfahani · Matthew Jagielski -
2023 : Tackling Provably Hard Representative Selection viaGraph Neural Networks »
Mehran Kazemi · Anton Tsitsulin · Hossein Esfandiari · Mohammad Hossein Bateni · Deepak Ramachandran · Bryan Perozzi · Vahab Mirrokni -
2023 : The Neuro-Symbolic Inverse Planning Engine (NIPE): Modeling probabilistic social inferences from linguistic inputs »
Lance Ying · Katie Collins · Megan Wei · Cedegao Zhang · Tan Zhi-Xuan · Adrian Weller · Josh Tenenbaum · Catherine Wong -
2023 Workshop: The Many Facets of Preference-Based Learning »
Aadirupa Saha · Mohammad Ghavamzadeh · Robert Busa-Fekete · Branislav Kveton · Viktor Bengs -
2023 Oral: Simplex Random Features »
Isaac Reid · Krzysztof Choromanski · Valerii Likhosherstov · Adrian Weller -
2023 Poster: Efficient Graph Field Integrators Meet Point Clouds »
Krzysztof Choromanski · Arijit Sehanobish · Han Lin · YUNFAN ZHAO · Eli Berger · Tetiana Parshakova · Qingkai Pan · David Watkins · Tianyi Zhang · Valerii Likhosherstov · Somnath Basu Roy Chowdhury · Kumar Avinava Dubey · Deepali Jain · Tamas Sarlos · Snigdha Chaturvedi · Adrian Weller -
2023 Poster: Simplex Random Features »
Isaac Reid · Krzysztof Choromanski · Valerii Likhosherstov · Adrian Weller -
2023 Poster: Why Is Public Pretraining Necessary for Private Model Training? »
Arun Ganesh · Mahdi Haghifam · Milad Nasresfahani · Sewoong Oh · Thomas Steinke · Om Thakkar · Abhradeep Guha Thakurta · Lun Wang -
2023 Poster: Multi-Task Off-Policy Learning from Bandit Feedback »
Joey Hong · Branislav Kveton · Manzil Zaheer · Sumeet Katariya · Mohammad Ghavamzadeh -
2023 Poster: Effectively Using Public Data in Privacy Preserving Machine Learning »
Milad Nasresfahani · Saeed Mahloujifar · Xinyu Tang · Prateek Mittal · Amir Houmansadr -
2023 Poster: Controllability-Aware Unsupervised Skill Discovery »
Seohong Park · Kimin Lee · Youngwoon Lee · Pieter Abbeel -
2023 Poster: A Simple Zero-shot Prompt Weighting Technique to Improve Prompt Ensembling in Text-Image Models »
James Allingham · JIE REN · Michael Dusenberry · Xiuye Gu · Yin Cui · Dustin Tran · Jeremiah Liu · Balaji Lakshminarayanan -
2023 Poster: Is Learning Summary Statistics Necessary for Likelihood-free Inference? »
Yanzhi Chen · Michael Gutmann · Adrian Weller -
2023 Poster: Multi-View Masked World Models for Visual Robotic Manipulation »
Younggyo Seo · Junsu Kim · Stephen James · Kimin Lee · Jinwoo Shin · Pieter Abbeel -
2023 Poster: Optimizing DDPM Sampling with Shortcut Fine-Tuning »
Ying Fan · Kangwook Lee -
2022 : Plex: Towards Reliability using Pretrained Large Model Extensions »
Dustin Tran · Andreas Kirsch · Balaji Lakshminarayanan · Huiyi Hu · Du Phan · D. Sculley · Jasper Snoek · Jeremiah Liu · JIE REN · Joost van Amersfoort · Kehang Han · Estefany Kelly Buchanan · Kevin Murphy · Mark Collier · Michael Dusenberry · Neil Band · Nithum Thain · Rodolphe Jenatton · Tim G. J Rudner · Yarin Gal · Zachary Nado · Zelda Mariet · Zi Wang · Zoubin Ghahramani -
2022 : Spotlight Presentations »
Adrian Weller · Osbert Bastani · Jake Snell · Tal Schuster · Stephen Bates · Zhendong Wang · Margaux Zaffran · Danielle Rasooly · Varun Babbar -
2022 Workshop: Workshop on Human-Machine Collaboration and Teaming »
Umang Bhatt · Katie Collins · Maria De-Arteaga · Bradley Love · Adrian Weller -
2022 Poster: From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers »
Krzysztof Choromanski · Han Lin · Haoxian Chen · Tianyi Zhang · Arijit Sehanobish · Valerii Likhosherstov · Jack Parker-Holder · Tamas Sarlos · Adrian Weller · Thomas Weingarten -
2022 Poster: Deep Hierarchy in Bandits »
Joey Hong · Branislav Kveton · Sumeet Katariya · Manzil Zaheer · Mohammad Ghavamzadeh -
2022 Poster: Measuring Representational Robustness of Neural Networks Through Shared Invariances »
Vedant Nanda · Till Speicher · Camila Kolling · John P Dickerson · Krishna Gummadi · Adrian Weller -
2022 Spotlight: Deep Hierarchy in Bandits »
Joey Hong · Branislav Kveton · Sumeet Katariya · Manzil Zaheer · Mohammad Ghavamzadeh -
2022 Oral: Measuring Representational Robustness of Neural Networks Through Shared Invariances »
Vedant Nanda · Till Speicher · Camila Kolling · John P Dickerson · Krishna Gummadi · Adrian Weller -
2022 Spotlight: From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers »
Krzysztof Choromanski · Han Lin · Haoxian Chen · Tianyi Zhang · Arijit Sehanobish · Valerii Likhosherstov · Jack Parker-Holder · Tamas Sarlos · Adrian Weller · Thomas Weingarten -
2022 Poster: Feature and Parameter Selection in Stochastic Linear Bandits »
Ahmadreza Moradipari · Berkay Turan · Yasin Abbasi-Yadkori · Mahnoosh Alizadeh · Mohammad Ghavamzadeh -
2022 Spotlight: Feature and Parameter Selection in Stochastic Linear Bandits »
Ahmadreza Moradipari · Berkay Turan · Yasin Abbasi-Yadkori · Mahnoosh Alizadeh · Mohammad Ghavamzadeh -
2022 Poster: POEM: Out-of-Distribution Detection with Posterior Sampling »
Yifei Ming · Ying Fan · Sharon Li -
2022 Oral: POEM: Out-of-Distribution Detection with Posterior Sampling »
Yifei Ming · Ying Fan · Sharon Li -
2021 Poster: Model-based Reinforcement Learning for Continuous Control with Posterior Sampling »
Ying Fan · Yifei Ming -
2021 Poster: Debiasing a First-order Heuristic for Approximate Bi-level Optimization »
Valerii Likhosherstov · Xingyou Song · Krzysztof Choromanski · Jared Quincy Davis · Adrian Weller -
2021 Oral: Model-based Reinforcement Learning for Continuous Control with Posterior Sampling »
Ying Fan · Yifei Ming -
2021 Spotlight: Debiasing a First-order Heuristic for Approximate Bi-level Optimization »
Valerii Likhosherstov · Xingyou Song · Krzysztof Choromanski · Jared Quincy Davis · Adrian Weller -
2021 Poster: From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization »
Julien Perolat · Remi Munos · Jean-Baptiste Lespiau · Shayegan Omidshafiei · Mark Rowland · Pedro Ortega · Neil Burch · Thomas Anthony · David Balduzzi · Bart De Vylder · Georgios Piliouras · Marc Lanctot · Karl Tuyls -
2021 Spotlight: From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization »
Julien Perolat · Remi Munos · Jean-Baptiste Lespiau · Shayegan Omidshafiei · Mark Rowland · Pedro Ortega · Neil Burch · Thomas Anthony · David Balduzzi · Bart De Vylder · Georgios Piliouras · Marc Lanctot · Karl Tuyls -
2021 Poster: PID Accelerated Value Iteration Algorithm »
Amir-massoud Farahmand · Mohammad Ghavamzadeh -
2021 Spotlight: PID Accelerated Value Iteration Algorithm »
Amir-massoud Farahmand · Mohammad Ghavamzadeh -
2020 Workshop: 5th ICML Workshop on Human Interpretability in Machine Learning (WHI) »
Adrian Weller · Alice Xiang · Amit Dhurandhar · Been Kim · Dennis Wei · Kush Varshney · Umang Bhatt -
2020 Poster: Stochastic Flows and Geometric Optimization on the Orthogonal Group »
Krzysztof Choromanski · David Cheikhi · Jared Quincy Davis · Valerii Likhosherstov · Achille Nazaret · Achraf Bahamou · Xingyou Song · Mrugank Akarte · Jack Parker-Holder · Jacob Bergquist · Yuan Gao · Aldo Pacchiano · Tamas Sarlos · Adrian Weller · Vikas Sindhwani -
2020 Poster: Fast computation of Nash Equilibria in Imperfect Information Games »
Remi Munos · Julien Perolat · Jean-Baptiste Lespiau · Mark Rowland · Bart De Vylder · Marc Lanctot · Finbarr Timbers · Daniel Hennes · Shayegan Omidshafiei · Audrunas Gruslys · Mohammad Gheshlaghi Azar · Edward Lockhart · Karl Tuyls -
2019 Workshop: Human In the Loop Learning (HILL) »
Xin Wang · Xin Wang · Fisher Yu · Shanghang Zhang · Joseph Gonzalez · Yangqing Jia · Sarah Bird · Kush Varshney · Been Kim · Adrian Weller -
2019 Poster: Unifying Orthogonal Monte Carlo Methods »
Krzysztof Choromanski · Mark Rowland · Wenyu Chen · Adrian Weller -
2019 Oral: Unifying Orthogonal Monte Carlo Methods »
Krzysztof Choromanski · Mark Rowland · Wenyu Chen · Adrian Weller -
2019 Poster: TibGM: A Transferable and Information-Based Graphical Model Approach for Reinforcement Learning »
Tameem Adel · Adrian Weller -
2019 Oral: TibGM: A Transferable and Information-Based Graphical Model Approach for Reinforcement Learning »
Tameem Adel · Adrian Weller -
2018 Poster: Blind Justice: Fairness with Encrypted Sensitive Attributes »
Niki Kilbertus · Adria Gascon · Matt Kusner · Michael Veale · Krishna Gummadi · Adrian Weller -
2018 Oral: Blind Justice: Fairness with Encrypted Sensitive Attributes »
Niki Kilbertus · Adria Gascon · Matt Kusner · Michael Veale · Krishna Gummadi · Adrian Weller -
2018 Poster: Bucket Renormalization for Approximate Inference »
Sungsoo Ahn · Michael Chertkov · Adrian Weller · Jinwoo Shin -
2018 Oral: Bucket Renormalization for Approximate Inference »
Sungsoo Ahn · Michael Chertkov · Adrian Weller · Jinwoo Shin -
2018 Poster: Structured Evolution with Compact Architectures for Scalable Policy Optimization »
Krzysztof Choromanski · Mark Rowland · Vikas Sindhwani · Richard E Turner · Adrian Weller -
2018 Poster: Discovering Interpretable Representations for Both Deep Generative and Discriminative Models »
Tameem Adel · Zoubin Ghahramani · Adrian Weller -
2018 Oral: Discovering Interpretable Representations for Both Deep Generative and Discriminative Models »
Tameem Adel · Zoubin Ghahramani · Adrian Weller -
2018 Oral: Structured Evolution with Compact Architectures for Scalable Policy Optimization »
Krzysztof Choromanski · Mark Rowland · Vikas Sindhwani · Richard E Turner · Adrian Weller -
2017 Workshop: Reliable Machine Learning in the Wild »
Dylan Hadfield-Menell · Jacob Steinhardt · Adrian Weller · Smitha Milli -
2017 : A. Weller, "Challenges for Transparency" »
Adrian Weller -
2017 Workshop: Workshop on Human Interpretability in Machine Learning (WHI) »
Kush Varshney · Adrian Weller · Been Kim · Dmitry Malioutov -
2017 Poster: Lost Relatives of the Gumbel Trick »
Matej Balog · Nilesh Tripuraneni · Zoubin Ghahramani · Adrian Weller -
2017 Talk: Lost Relatives of the Gumbel Trick »
Matej Balog · Nilesh Tripuraneni · Zoubin Ghahramani · Adrian Weller