Poster in Workshop: ES-FoMo: Efficient Systems for Foundation Models
ROSA: Random Orthogonal Subspace Adaptation
Marawan Gamal · Guillaume Rabusseau
Model training requires significantly more memory than inference. Parameter-efficient fine-tuning (PEFT) methods provide a means of adapting large models to downstream tasks using less memory. However, existing methods either introduce latency overhead at inference time or achieve subpar downstream performance compared with full fine-tuning. In this work we propose Random Orthogonal Subspace Adaptation (ROSA), a method that exceeds the performance of previous PEFT methods by a significant margin while maintaining zero latency overhead at inference time. In contrast to previous methods, ROSA is able to adapt subspaces of larger size without consuming additional memory at runtime. As PEFT methods are especially useful in the natural language processing domain, we evaluate ROSA by fine-tuning GPT-2 on various natural language generation (NLG) tasks. We will make our code publicly available upon acceptance.
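The abstract does not spell out the mechanism, so the following is only a minimal sketch of one plausible reading: sample a random low-rank subspace of a weight matrix (here via SVD), train only that factor while freezing the residual, and fold the trained factor back into a single dense matrix so inference incurs no extra latency. The class name `RosaLinear`, the `rank` parameter, and the SVD-based subspace sampling are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class RosaLinear(nn.Module):
    """Hypothetical sketch: train a randomly sampled low-rank subspace of W."""

    def __init__(self, linear: nn.Linear, rank: int = 8):
        super().__init__()
        W = linear.weight.data  # (out_features, in_features)
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        # Randomly pick `rank` singular directions as the trainable subspace.
        idx = torch.randperm(S.numel())[:rank]
        mask = torch.zeros_like(S, dtype=torch.bool)
        mask[idx] = True
        # Trainable low-rank factors spanning the sampled subspace.
        self.A = nn.Parameter(U[:, mask] * S[mask])  # (out, rank)
        self.B = nn.Parameter(Vh[mask, :])           # (rank, in)
        # Frozen residual: everything outside the sampled subspace.
        self.register_buffer("W_res", W - (U[:, mask] * S[mask]) @ Vh[mask, :])
        self.bias = linear.bias

    def forward(self, x):
        W = self.W_res + self.A @ self.B
        out = x @ W.T
        return out + self.bias if self.bias is not None else out

    def merge(self) -> nn.Linear:
        # Fold the trained subspace back into a plain Linear layer,
        # so inference uses one dense matmul with no latency overhead.
        out_f, in_f = self.W_res.shape
        merged = nn.Linear(in_f, out_f, bias=self.bias is not None)
        merged.weight.data = self.W_res + self.A.data @ self.B.data
        if self.bias is not None:
            merged.bias.data = self.bias.data
        return merged
```

Usage would amount to wrapping selected `nn.Linear` layers of a pretrained model with `RosaLinear`, training only `A` and `B`, and calling `merge()` before deployment; the frozen residual plus the merge step is what keeps inference-time latency identical to the original model.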