A Unified Framework for Diffusion Model Unlearning with f-Divergence
Nicola Novello ⋅ Federico Fontana ⋅ Luigi Cinque ⋅ Deniz Gunduz ⋅ Andrea Tonello
Abstract
Most current methods for unlearning concepts in text-to-image diffusion models rely on mean squared error (MSE)-based loss functions to align the distribution of a target concept with that of an anchor concept. In this paper, we generalize this idea into a unified $f$-divergence-based framework that recovers the standard MSE loss as a specific instance. By generalizing the loss function, we theoretically analyze and numerically validate how different $f$-divergences affect the gradient magnitude and the convergence properties of the algorithm, and hence the quality of unlearning. The proposed unified framework offers a flexible paradigm for selecting the divergence best suited to a given application and user goal, allowing finer control over the trade-off between unlearning efficacy and generative fidelity.
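To make the "MSE as a specific instance" claim concrete, the following is a minimal numerical sketch (not the paper's implementation, and all variable names are illustrative): for two Gaussians with equal isotropic variance, the KL divergence, one member of the $f$-divergence family, reduces in closed form to a scaled squared distance between the means, i.e. an MSE-style alignment objective. The sketch checks this identity both analytically and by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4          # dimension of the (toy) noise-prediction vectors
sigma = 1.0    # shared isotropic standard deviation

# Toy stand-ins for the target and anchor mean predictions
mu_target = rng.normal(size=d)
mu_anchor = rng.normal(size=d)

# Closed form: KL(N(mu1, s^2 I) || N(mu2, s^2 I)) = ||mu1 - mu2||^2 / (2 s^2)
kl_closed = np.sum((mu_target - mu_anchor) ** 2) / (2 * sigma**2)

# The MSE-style alignment loss (up to the 1/(2 s^2) scale, with s = 1 here)
mse_loss = 0.5 * np.sum((mu_target - mu_anchor) ** 2)

# Monte Carlo check of the same KL via the log-density ratio under p
x = mu_target + sigma * rng.normal(size=(200_000, d))
log_p = -np.sum((x - mu_target) ** 2, axis=1) / (2 * sigma**2)
log_q = -np.sum((x - mu_anchor) ** 2, axis=1) / (2 * sigma**2)
kl_mc = np.mean(log_p - log_q)
```

Under this equal-variance Gaussian assumption, `kl_closed` and `mse_loss` coincide exactly, and `kl_mc` matches them up to sampling noise; other choices of $f$ yield different weightings of the same mean discrepancy, which is the degree of freedom the framework exposes.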