Fine-Tuning of Transformer Models with Frames
Harshavardhan Adepu ⋅ Li Zhang ⋅ Sanjiv Kumar ⋅ Vikas Singh
Abstract
Parameter-Efficient Fine-Tuning (PEFT) strategies such as Low-Rank Adaptation (LoRA) are effective solutions for fine-tuning large-scale pre-trained models; however, their memory requirement scales with the model size as $\mathcal{O}(dr)$, where $d$ is the model's hidden dimension and $r$ is the rank. Our proposal, FrameFT, models the parameter update $\Delta W$ with a sparse coefficient matrix in a Fusion Frame basis. Fusion Frames can be generated algorithmically and shared across model layers, enabling highly efficient updates. Only the sparse coefficients of the basis expansion are stored and optimized, substantially reducing the memory footprint and parameter count. The sparse structure of the coefficient matrix in FrameFT, together with the sparsity of the Fusion Frames themselves, gives sizable compute benefits. Our technical analysis shows that FrameFT admits formal convergence guarantees. We evaluate our method across a suite of supervised fine-tuning benchmarks, primarily focusing on language tasks, but also report applicability to vision models. Our empirical evaluations show that FrameFT achieves performance on par with or exceeding state-of-the-art PEFT techniques, while requiring far fewer trainable parameters and less memory.
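To make the parameterization concrete, the following is a minimal, illustrative PyTorch-style sketch of the idea described above: the update $\Delta W$ is expressed as sparse coefficients over a fixed basis that is generated once and shared across layers, with only the coefficients trainable. All names (`make_frame`, `FrameFTLinear`), the use of a plain vector frame rather than a true Fusion Frame of subspaces, and the random fixed sparsity pattern are our own simplifying assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn


def make_frame(d: int, n_atoms: int, seed: int = 0) -> torch.Tensor:
    # Hypothetical frame generator: unit-norm random atoms spanning R^d.
    # The frame is generated algorithmically, so it never needs to be stored
    # per layer; only a seed/recipe is required.
    g = torch.Generator().manual_seed(seed)
    F = torch.randn(n_atoms, d, generator=g)
    return F / F.norm(dim=1, keepdim=True)


class FrameFTLinear(nn.Module):
    """Frozen pre-trained linear layer plus a sparse frame-coefficient update."""

    def __init__(self, base: nn.Linear, frame: torch.Tensor, n_active: int):
        super().__init__()
        self.base = base.requires_grad_(False)     # frozen pre-trained W
        self.register_buffer("frame", frame)       # shared basis, not trained
        n_atoms, _ = frame.shape
        # Fixed sparsity pattern: each output row uses only n_active atoms.
        idx = torch.stack([torch.randperm(n_atoms)[:n_active]
                           for _ in range(base.out_features)])
        self.register_buffer("idx", idx)
        # The only trainable parameters: the sparse coefficient matrix.
        self.coef = nn.Parameter(torch.zeros(base.out_features, n_active))

    def delta_w(self) -> torch.Tensor:
        # Delta W[i] = sum_j coef[i, j] * frame[idx[i, j]]
        return torch.einsum("ok,okd->od", self.coef, self.frame[self.idx])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + x @ self.delta_w().T


# Usage sketch: one frame is built once and reused by every adapted layer,
# so per-layer trainable state is only the (out_features x n_active) coefficients.
frame = make_frame(d=768, n_atoms=1024)
layer = FrameFTLinear(nn.Linear(768, 768), frame, n_active=8)
```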