FairMerging: Rethinking Model Merging through the Lens of Fairness
Abstract
Model merging offers an appealing route to multi-task learning by composing independently fine-tuned checkpoints without centralized data or retraining. However, this convenience can come with a hidden cost: model merging may amplify performance disparities across subgroups, raising fairness concerns even when average accuracy remains competitive. To explain this phenomenon, we develop a sensitivity-based theoretical analysis that upper bounds the fairness gap induced by model merging. The analysis, together with empirical verification, reveals that this fairness gap is governed by two coupled factors: a merging-magnitude term that measures how far the merged parameters move from the target model, and global sensitivity terms that determine how unevenly this perturbation affects subgroup losses. Guided by these insights, we propose FairMerging, a two-stage merging framework that first reduces the sensitivity of the target model and then performs fairness-aware coefficient optimization with orthogonally normalized task vectors. Experiments across multiple datasets, backbones, and merging baselines demonstrate that FairMerging substantially mitigates unfairness while retaining competitive multi-task performance.