

Poster in Workshop: ICML 2024 Workshop on Foundation Models in the Wild

Model Breadcrumbs: Scalable Upcycling of Finetuned Foundation Models via Sparse Task Vectors Merging

MohammadReza Davari · Eugene Belilovsky

Keywords: [ foundation models ] [ model merging ] [ transfer learning ]


Abstract:

The rapid development of AI systems has been greatly influenced by foundation models. Typically, these models are fine-tuned for specific tasks, leading to numerous task-specific versions. This paper addresses the challenge of merging and upcycling these fine-tuned models. We introduce Model Breadcrumbs, a simple method using sparse weight trajectories to guide model adaptation within a pre-trained model's weight space. Our approach improves performance across multiple tasks without the need for hyperparameter tuning for each new task. Extensive experiments, involving various models, tasks, and modalities, demonstrate that Model Breadcrumbs provides an efficient and effective solution for creating and updating multi-task models, promoting a community-driven effort for updatable machine learning.
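The abstract describes merging fine-tuned models via sparse weight trajectories (task vectors). Below is a minimal, hypothetical PyTorch sketch of what such a procedure might look like: a task vector is taken as the difference between fine-tuned and pre-trained weights, sparsified by magnitude (dropping both the smallest entries and a small fraction of outliers), and the resulting "breadcrumbs" are summed onto the pre-trained weights with a scaling factor. The function names, thresholds, and exact masking strategy are illustrative assumptions, not the authors' published procedure.

```python
import torch

def make_breadcrumb(pretrained, finetuned, keep_frac=0.80, drop_top_frac=0.01):
    """Hypothetical sparse task vector: weight deltas with the smallest-magnitude
    entries and a small fraction of the largest-magnitude outliers masked out.
    Thresholds (keep_frac, drop_top_frac) are illustrative, not the paper's values."""
    crumb = {}
    for name, w_pre in pretrained.items():
        delta = finetuned[name] - w_pre
        mags, _ = delta.abs().flatten().sort()
        n = mags.numel()
        lo = mags[int((1.0 - keep_frac) * (n - 1))]      # lower magnitude cut-off
        hi = mags[int((1.0 - drop_top_frac) * (n - 1))]  # upper (outlier) cut-off
        mask = (delta.abs() >= lo) & (delta.abs() <= hi)
        crumb[name] = delta * mask
    return crumb

def merge_breadcrumbs(pretrained, crumbs, alpha=0.3):
    """Add the scaled sum of all breadcrumbs onto the pre-trained weights."""
    merged = {name: w.clone() for name, w in pretrained.items()}
    for crumb in crumbs:
        for name, delta in crumb.items():
            merged[name] += alpha * delta
    return merged

# Usage sketch with toy state dicts (shapes and values are placeholders):
if __name__ == "__main__":
    pre = {"layer.weight": torch.randn(4, 4)}
    ft_a = {"layer.weight": pre["layer.weight"] + 0.1 * torch.randn(4, 4)}
    ft_b = {"layer.weight": pre["layer.weight"] + 0.1 * torch.randn(4, 4)}
    crumbs = [make_breadcrumb(pre, ft) for ft in (ft_a, ft_b)]
    multi_task = merge_breadcrumbs(pre, crumbs)
```

The key design point suggested by the abstract is that a single, fixed set of sparsification and scaling choices is reused across tasks, so adding a new task does not require per-task hyperparameter tuning.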
