Poster in Workshop: Challenges in Deployable Generative AI

The Journey, Not the Destination: How Data Guides Diffusion Models

Kristian Georgiev · Joshua Vendrow · Hadi Salman · Sung Min (Sam) Park · Aleksander Madry

Keywords: [ data attribution ] [ diffusion models ] [ memorization ] [ privacy ] [ data valuation ] [ influence estimation ]


Abstract:

Diffusion-based generative models can synthesize photo-realistic images of unprecedented quality and diversity. However, attributing these images back to the training data, that is, identifying the specific training examples that caused the images to be generated, remains challenging. In this paper, we propose a framework that (i) formalizes data attribution in the context of diffusion models and (ii) provides a method for computing attributions efficiently. By applying our framework to CIFAR-10 and MS COCO, we uncover visually compelling attributions, which we validate through counterfactual analysis.