How Hard Can It Be? Hardness-Aware Multi-Objective Unlearning
Abstract
Machine unlearning aims to remove the influence of specific training samples due to privacy, copyright, or bias concerns. Multi-objective unlearning seeks to ensure effective forgetting of such samples while preserving the utility of the unlearned model. Existing multi-objective unlearning methods typically optimize a weighted combination of the objectives, providing no guarantee that any individual objective meets a required performance level and ignoring the similarity between the forget data and the retain data. In this work, we quantify how hard it is to reconcile the conflicting objectives arising from overlapping data and provide conditions under which collateral forgetting is unavoidable, that is, when improving forget quality necessarily degrades retain utility. Using this hardness measure, we propose a hardness-aware multi-objective unlearning algorithm (HAMU) that adapts the unlearning updates based on per-iteration hardness. Our algorithm is applicable to non-convex models and is easily parallelizable, making it readily deployable in real-world scenarios. We empirically demonstrate HAMU's superior performance over baselines on both image and text datasets using large-scale models.
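The abstract describes HAMU only at a high level and does not specify the update rule. As a minimal illustrative sketch of the general idea of hardness-adaptive updates, assuming PyTorch, a cosine-similarity-based hardness score, and a simple linear weighting rule (all of which are assumptions for illustration, not the paper's actual algorithm), a hardness-aware step between the forget and retain objectives might look like:

```python
import torch

def hardness_aware_step(model, forget_loss, retain_loss, optimizer):
    """Illustrative hardness-aware multi-objective unlearning step.

    Assumed form only: the hardness score and weighting rule below are
    placeholders, not HAMU's actual definitions.
    """
    # Gradient of the forget objective (to be ascended, i.e., unlearned).
    g_f = torch.autograd.grad(forget_loss, model.parameters(), retain_graph=True)
    # Gradient of the retain objective (standard loss on retain data).
    g_r = torch.autograd.grad(retain_loss, model.parameters())

    flat_f = torch.cat([g.flatten() for g in g_f])
    flat_r = torch.cat([g.flatten() for g in g_r])

    # Per-iteration "hardness": cosine similarity between the two gradients.
    # Highly aligned gradients suggest overlapping forget/retain data, the
    # regime where collateral forgetting is hard to avoid.
    hardness = torch.nn.functional.cosine_similarity(flat_f, flat_r, dim=0)

    # Down-weight the forgetting direction as the objectives become harder
    # to reconcile (assumed linear rule, clamped to [0, 1]).
    alpha = torch.clamp(1.0 - hardness, 0.0, 1.0)

    optimizer.zero_grad()
    for p, gf, gr in zip(model.parameters(), g_f, g_r):
        # Ascend on the forget loss, descend on the retain loss.
        p.grad = -alpha * gf + (1.0 - alpha) * gr
    optimizer.step()
```

Because each step needs only the two per-batch gradients, a scheme of this shape parallelizes across data shards in the usual data-parallel fashion, consistent with the abstract's deployability claim.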