Poster in Workshop: ICML 2024 Workshop on Foundation Models in the Wild
Finding NeMo: Localizing Neurons Responsible For Memorization in Diffusion Models
Lukas Struppek · Dominik Hintersdorf · Kristian Kersting · Adam Dziedzic · Franziska Boenisch
Keywords: [ Diffusion Models ] [ memorization ]
Diffusion models (DMs) produce highly detailed, high-quality images, a capability achieved through extensive training on large datasets. Unfortunately, this practice raises privacy and intellectual property concerns, as DMs can memorize and later reproduce potentially sensitive or copyrighted training images at inference time. Prior efforts to prevent this issue are viable only when the DM is developed and deployed in a secure, constantly monitored environment. However, they carry the risk of adversaries circumventing the safeguards, and they are ineffective when the DM itself is publicly released. To address this problem, we introduce NeMo, the first method to localize the memorization of individual data samples down to the level of neurons in a DM's cross-attention layers. In our experiments, we make the intriguing finding that, in many cases, single neurons are responsible for memorizing particular training samples. By deactivating these memorization neurons, we prevent the replication of training data at inference time, increase the diversity of the generated outputs, and mitigate the leakage of sensitive data.
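As a rough illustration of the deactivation step, the sketch below zeroes out a few hypothetical "memorization neurons" in a toy cross-attention block via a PyTorch forward hook. The toy module, the choice of the value projection as the hook target, and the neuron indices are all illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch (assumptions, not NeMo's code): ablate selected output
# neurons of a cross-attention sub-layer at inference time with a hook.
import torch
import torch.nn as nn


class ToyCrossAttention(nn.Module):
    """Stand-in for one cross-attention block of a diffusion model."""

    def __init__(self, dim: int = 8):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_k = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)  # value projection (hook target here)

    def forward(self, x: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        q, k, v = self.to_q(x), self.to_k(context), self.to_v(context)
        attn = torch.softmax(q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5, dim=-1)
        return attn @ v


def deactivate_neurons(module: nn.Module, neuron_indices: list[int]):
    """Zero out the given output neurons of `module` during the forward pass."""

    def hook(_module, _inputs, output):
        output = output.clone()
        output[..., neuron_indices] = 0.0  # ablate the localized neurons
        return output  # returned value replaces the module's output

    return module.register_forward_hook(hook)


# Usage: suppose a NeMo-style analysis flagged neurons 2 and 5 of the value
# projection as memorizing a particular training image (hypothetical indices).
layer = ToyCrossAttention()
handle = deactivate_neurons(layer.to_v, neuron_indices=[2, 5])
out = layer(torch.randn(1, 4, 8), torch.randn(1, 4, 8))
handle.remove()  # detach the hook to restore the original behavior
```

Because the hook only intercepts activations, the model's weights stay untouched and the ablation can be switched on and off per generation.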