Poster in Workshop: ES-FoMo II: 2nd Workshop on Efficient Systems for Foundation Models
Seeded LoRA: Collaborative Fine-Tuning Through Seed Initialization of Adapters
Alejandro Rodriguez Salamanca · Ahmet Üstün · Nicki Skafte Detlefsen · Tim Dettmers
Parameter-Efficient Fine-Tuning (PEFT) methods enable cost-effective adaptation of pretrained language models to specific tasks and domains. Collaborative Fine-Tuning (CoFT) seeks to merge these specialized models into a single model, often a routed Mixture-of-Experts (MoE) model, to achieve better generalization across domains and tasks. However, current CoFT models require a post-merge fine-tuning stage, making these approaches inaccessible to users who lack fine-tuning expertise. We introduce Seeded LoRA, a novel CoFT approach that does not require post-merge fine-tuning, enabling plug-and-play merging of PEFT adapters. Seeded LoRA outperforms LoRA and MoE LoRA (MoLoRA) approaches by an average of 7 percentage points across 16 zero-shot tasks. Seeded LoRA works by initializing a model with a generic seed expert low-rank adapter, ensuring that subsequent fine-tuning runs start in the same optimization space and exhibit linear mode connectivity. This allows independently fine-tuned models to be integrated into a single model using a static, untrained soft router with uniform probabilities. We show that this formulation is equivalent to grouped convolution or multi-head processing, which explains its effectiveness. Additionally, we highlight that Seeded LoRA alleviates most of the routing failures that arise in post-merge fine-tuning, making it a suitable base method for future routed CoFT approaches.
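As a rough illustration of the mechanism described in the abstract, the PyTorch sketch below (our own reading, not the authors' code; the names SeededLoRALinear, seed_A, seed_B, and router are all illustrative) shows how every expert adapter could be initialized from copies of one shared seed adapter and then combined by a static, untrained uniform router:

```python
# Minimal sketch, assuming a standard LoRA parameterization; this is an
# interpretation of the abstract, not the authors' implementation.
import torch
import torch.nn as nn

class SeededLoRALinear(nn.Module):
    """Frozen base linear layer plus several LoRA experts that all start
    from copies of one shared 'seed' adapter and are combined by a
    static, untrained soft router with uniform probabilities."""

    def __init__(self, base: nn.Linear, rank: int = 8, num_experts: int = 4):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pretrained weights stay frozen

        d_in, d_out = base.in_features, base.out_features
        # One generic seed adapter; each expert starts from a copy of it,
        # so independent fine-tuning runs share an optimization space.
        seed_A = torch.randn(rank, d_in) * 0.01
        seed_B = torch.zeros(d_out, rank)  # usual LoRA zero init for B
        self.A = nn.ParameterList(
            [nn.Parameter(seed_A.clone()) for _ in range(num_experts)]
        )
        self.B = nn.ParameterList(
            [nn.Parameter(seed_B.clone()) for _ in range(num_experts)]
        )
        # Static router: uniform probability over experts, never trained.
        self.register_buffer(
            "router", torch.full((num_experts,), 1.0 / num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.base(x)
        # Uniform mixture of the low-rank expert updates.
        for w, A, B in zip(self.router, self.A, self.B):
            out = out + w * (x @ A.T @ B.T)
        return out

layer = SeededLoRALinear(nn.Linear(64, 64))
y = layer(torch.randn(2, 64))  # (2, 64)
```

In this reading, each expert's (A, B) pair would be fine-tuned on its own task after seeding; because all runs start from the same seed, the abstract argues they remain linearly mode-connected and can be merged by the fixed uniform router with no post-merge training.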
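The claimed equivalence to grouped convolution or multi-head processing can be checked numerically: a uniform mixture of k rank-r expert updates equals a single rank-(k·r) adapter whose intermediate activation is split into k independent groups, one per head. A minimal sketch, under the assumption that the uniform router weights are folded into the up-projections:

```python
# Hedged illustration, not the paper's derivation: shows that the uniform
# expert mixture and the stacked multi-head/grouped form compute the same map.
import torch

k, r, d_in, d_out = 4, 8, 64, 32
x = torch.randn(5, d_in)
A = [torch.randn(r, d_in) for _ in range(k)]
B = [torch.randn(d_out, r) for _ in range(k)]

# Per-expert view: uniform mixture of k low-rank updates.
mixture = sum(x @ Ai.T @ Bi.T for Ai, Bi in zip(A, B)) / k

# Multi-head view: one rank-(k*r) adapter whose intermediate activation is
# split into k groups, each handled by its own B_i block (router folded in).
A_cat = torch.cat(A, dim=0)        # (k*r, d_in)
B_cat = torch.cat(B, dim=1) / k    # (d_out, k*r)
multihead = x @ A_cat.T @ B_cat.T

print(torch.allclose(mixture, multihead, atol=1e-4))  # True
```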