Poster

Localizing Task Information for Improved Model Merging and Compression

Ke Wang · Nikolaos Dimitriadis · Guillermo Ortiz-Jimenez · François Fleuret · Pascal Frossard


Abstract: Model merging and task arithmetic have emerged as promising scalable approaches for merging multiple single-task checkpoints into one multi-task model, but their applicability is reduced by significant performance loss. Previous works have linked these drops to interference in the weight space and the erasure of important task-specific features. Instead, we show that the information required to solve each task is still preserved after merging, as different tasks mostly rely on non-overlapping sets of weights. In this work, we propose a new method to identify these task supports given a collection of task vectors, and we show that one can retrieve over 99% of the single-task accuracy by applying our masks to the multi-task vector, effectively compressing the individual checkpoints. We study the statistics of intersections among the constructed masks and reveal the existence of "selfish" and "catastrophic" weights, i.e., parameters important exclusively to one task, and parameters irrelevant to all tasks but detrimental to multi-task fusion, respectively. For this reason, we propose Consensus Merging, an algorithm that eliminates such weights and improves the general performance of existing model merging approaches. Our experiments on vision and NLP benchmarks with up to 20 tasks show that Consensus Merging consistently improves existing approaches, while our compression scheme reduces storage from 57 GB to 8.2 GB while retaining 99.7% of the original performance.
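To make the two ideas in the abstract concrete, here is a minimal NumPy sketch on flat parameter vectors: per-task masks built over a summed multi-task vector, and a consensus filter that discards weights relevant to too few tasks. This is an illustration, not the authors' released implementation; the dominance-threshold rule and the hyperparameters `lam` and `min_tasks` are assumptions chosen for demonstration.

```python
# Illustrative sketch of task-support masking and Consensus Merging
# on flat NumPy parameter vectors. Thresholds are assumed, not the
# paper's exact formulation.
import numpy as np

def task_masks(task_vectors, lam=1.0):
    """Per-task binary masks marking where each task vector dominates
    the combined multi-task vector (the task's "support")."""
    multi_task = sum(task_vectors)  # simple task-arithmetic sum
    masks = []
    for tau in task_vectors:
        # Keep a weight for a task if its own update outweighs the
        # combined interference from all other tasks at that coordinate.
        masks.append(np.abs(tau) >= lam * np.abs(multi_task - tau))
    return multi_task, masks

def compress(theta0, multi_task, mask):
    """Approximately recover one single-task model: only the shared
    multi-task vector plus one bit per weight need to be stored."""
    return theta0 + mask * multi_task

def consensus_merge(theta0, task_vectors, lam=1.0, min_tasks=2):
    """Drop "selfish" weights (important to fewer than `min_tasks`
    tasks) and "catastrophic" weights (important to none) by keeping
    only coordinates that enough task masks agree on."""
    multi_task, masks = task_masks(task_vectors, lam)
    consensus = np.sum(masks, axis=0) >= min_tasks
    return theta0 + consensus * multi_task

# Toy usage on random stand-ins for checkpoints:
rng = np.random.default_rng(0)
theta0 = rng.normal(size=10_000)                       # pretrained weights
taus = [rng.normal(scale=0.1, size=10_000) for _ in range(4)]  # task vectors
merged = consensus_merge(theta0, taus)
multi_task, masks = task_masks(taus)
task0_approx = compress(theta0, multi_task, masks[0])  # per-task retrieval
```

In this reading, compression comes from storing one multi-task vector plus a binary mask per task instead of one full checkpoint per task, and Consensus Merging is a post-hoc filter that can be applied on top of existing merging schemes.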
