

Poster

Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning

Jing Xu · Jingzhao Zhang


Abstract:

Fine-tuning large language models (LLMs) can be costly. Parameter-efficient fine-tuning (PEFT) addresses this problem by training only a small fraction of the parameters. Its success simultaneously reveals the expressiveness and flexibility of pretrained models. Our work studies the limits of PEFT by further simplifying its design and reducing the number of trainable parameters below standard setups. To this end, we use Random Masking to tune the pretrained model. Our experiments reveal the surprising empirical effectiveness of Random Masking as well as the great expressive power of pretrained LLMs. We demonstrate that, with a larger-than-expected learning rate, Random Masking can achieve competitive performance using significantly fewer trainable parameters. We provide both empirical and theoretical explorations into the success of Random Masking. We show that masking induces a flatter loss landscape and more distant solutions, which allows for and necessitates large learning rates.
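As a rough illustration of the idea described above, the sketch below (not the authors' released code) randomly selects a small subset of a pretrained layer's weights and trains only those entries by zeroing all other gradients, paired with a deliberately large learning rate. The layer sizes, density, and learning-rate value are hypothetical placeholders.

```python
# Minimal sketch of Random Masking for parameter-efficient fine-tuning.
# Assumption: training a random sparse subset of weights is approximated here
# by multiplying gradients with a fixed binary mask before the optimizer step.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for one pretrained weight matrix; a real LLM has many such layers.
layer = nn.Linear(256, 256)

density = 0.01  # fraction of weights left trainable (illustrative value)
mask = (torch.rand_like(layer.weight) < density).float()

# Zero the gradients of masked-out (frozen) entries after each backward pass.
layer.weight.register_hook(lambda grad: grad * mask)
layer.bias.requires_grad_(False)

# The abstract notes that such sparse masking calls for a larger-than-expected
# learning rate; the value below is illustrative only.
optimizer = torch.optim.SGD([layer.weight], lr=1.0)

# One toy optimization step on random data.
x = torch.randn(32, 256)
target = torch.randn(32, 256)
loss = nn.functional.mse_loss(layer(x), target)
loss.backward()
optimizer.step()
```

Only the entries selected by the random mask are updated; all other weights keep their pretrained values, so the number of trainable parameters is controlled directly by the chosen density.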
