Poster in Workshop: ES-FoMo III: 3rd Workshop on Efficient Systems for Foundation Models

An Efficient Row-Based Sparse Fine-Tuning with Low Quantization Error
Cen-Jhih Li · Aditya Bhaskara
Abstract:
Fine-tuning is essential for adapting large language models to downstream tasks, but it can be costly for users with limited resources. To address this, Sparse Fine-Tuning (SpFT) and Low-Rank Adaptation (LoRA) have been widely adopted for efficient fine-tuning. In this work, we propose a new SpFT framework inspired by neural network pruning: we identify important neurons using structural pruning and fine-tune only the associated weights. Experiments on common language tasks show our method improves SpFT's memory efficiency by 20–50% while matching the accuracy of state-of-the-art methods such as LoRA variants.
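To make the row-based idea concrete, the PyTorch sketch below selects important output neurons (rows of a weight matrix) and updates only those rows during fine-tuning. The row-norm importance score, the keep ratio, and the gradient-masking hook are illustrative assumptions for this sketch, not the paper's exact criterion or implementation, and the quantization-error aspect is not modeled here.

```python
# Minimal sketch of row-based sparse fine-tuning (assumptions noted above).
import torch
import torch.nn as nn

def row_sparse_finetune_setup(linear: nn.Linear, keep_ratio: float = 0.5):
    """Pick important output neurons (rows) and mask gradients of the rest."""
    with torch.no_grad():
        # Illustrative structural importance proxy: L2 norm of each row.
        scores = linear.weight.norm(dim=1)
        k = max(1, int(keep_ratio * scores.numel()))
        keep = torch.topk(scores, k).indices
        row_mask = torch.zeros_like(scores, dtype=torch.bool)
        row_mask[keep] = True

    # Zero out gradients for frozen rows so only selected rows are updated.
    def mask_grad(grad):
        return grad * row_mask.unsqueeze(1).to(grad.dtype)

    linear.weight.register_hook(mask_grad)
    return row_mask

# Usage: apply to one (toy) projection layer and train as usual.
layer = nn.Linear(16, 32)
mask = row_sparse_finetune_setup(layer, keep_ratio=0.25)
opt = torch.optim.AdamW(layer.parameters(), lr=1e-4, weight_decay=0.0)
x, y = torch.randn(8, 16), torch.randn(8, 32)
loss = nn.functional.mse_loss(layer(x), y)
loss.backward()
opt.step()  # only the selected rows (and the bias) receive updates
```

In a real setting one would apply such a selection per layer of the language model and store optimizer state only for the retained rows, which is where the memory savings over dense fine-tuning come from.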