Large language models have been shown to present privacy risks through memorization of their training data, and several recent works have studied such risks for the pre-training phase. Little attention, however, has been given to the fine-tuning phase, and it is not well understood how different fine-tuning methods (such as fine-tuning the full model, the model head, or adapters) compare in terms of memorization risk. This is an increasing concern as the "pre-train and fine-tune" paradigm proliferates. In this paper, we empirically study the memorization of fine-tuning methods using membership inference and extraction attacks, and show that their susceptibility to these attacks differs substantially. We observe that fine-tuning only the head of the model has the highest susceptibility to attacks, whereas fine-tuning smaller adapters appears to be less vulnerable to known extraction attacks.
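A minimal sketch of the two ingredients the abstract describes, assuming a HuggingFace-style causal language model interface (calling the model with `input_ids` and `labels` returns an object with a `.loss`): freezing parameters so that only the output head is updated (head-only fine-tuning), and a simple loss-threshold membership inference score. The `head_name` argument and the use of per-example loss as the membership signal are illustrative assumptions, not the authors' exact setup.

```python
# Illustrative sketch (not the paper's code): head-only fine-tuning setup and a
# loss-threshold membership inference score for a HuggingFace-style causal LM.
import torch


def freeze_all_but_head(model: torch.nn.Module, head_name: str = "lm_head") -> None:
    """Head-only fine-tuning: keep gradients only for parameters whose name
    contains `head_name`; all other parameters are frozen."""
    for name, param in model.named_parameters():
        param.requires_grad = head_name in name


@torch.no_grad()
def membership_score(model: torch.nn.Module, input_ids: torch.Tensor) -> float:
    """Loss-threshold MIA signal: a lower language-modeling loss on a candidate
    sequence is taken as evidence that it was seen during fine-tuning."""
    out = model(input_ids=input_ids, labels=input_ids)  # assumes HF-style .loss
    return -out.loss.item()  # higher score = more member-like
```

Scores above a chosen threshold would be classified as training members; the same scoring can be applied to models fine-tuned with full, head-only, or adapter updates to compare their relative leakage.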
Author Information
FatemehSadat Mireshghallah (University of California San Diego)
Archit Uniyal (Panjab University, Chandigarh, India)
Tianhao Wang (University of Virginia, Charlottesville)
David Evans (University of Virginia)
Taylor Berg-Kirkpatrick (University of California San Diego)
More from the Same Authors
- 2021 : DP-SGD vs PATE: Which Has Less Disparate Impact on Model Accuracy?
  Archit Uniyal · Rakshit Naidu · Sasikanth Kotti · Patrik Joslin Kenfack · Sahib Singh · FatemehSadat Mireshghallah
- 2021 : Benchmarking Differential Privacy and Federated Learning for BERT Models
  Priyam Basu · Rakshit Naidu · Zumrut Muftuoglu · Sahib Singh · FatemehSadat Mireshghallah
- 2021 : Formalizing Distribution Inference Risks
  Anshuman Suri · David Evans
- 2023 : When Can Linear Learners be Robust to Indiscriminate Poisoning Attacks?
  Fnu Suya · Xiao Zhang · Yuan Tian · David Evans
- 2023 : Talk
  FatemehSadat Mireshghallah
- 2023 Workshop: Generative AI and Law (GenLaw)
  Katherine Lee · A. Feder Cooper · FatemehSadat Mireshghallah · Madiha Zahrah · James Grimmelmann · David Mimno · Deep Ganguli · Ludwig Schubert
- 2021 Poster: Model-Targeted Poisoning Attacks with Provable Convergence
  Fnu Suya · Saeed Mahloujifar · Anshuman Suri · David Evans · Yuan Tian
- 2021 Spotlight: Model-Targeted Poisoning Attacks with Provable Convergence
  Fnu Suya · Saeed Mahloujifar · Anshuman Suri · David Evans · Yuan Tian
- 2020 Poster: Divide and Conquer: Leveraging Intermediate Feature Representations for Quantized Training of Neural Networks
  Ahmed T. Elthakeb · Prannoy Pilligundla · FatemehSadat Mireshghallah · Alexander Cloninger · Hadi Esmaeilzadeh
- 2020 Poster: Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization
  Sicheng Zhu · Xiao Zhang · David Evans
- 2019 Workshop: Workshop on the Security and Privacy of Machine Learning
  Nicolas Papernot · Florian Tramer · Bo Li · Dan Boneh · David Evans · Somesh Jha · Percy Liang · Patrick McDaniel · Jacob Steinhardt · Dawn Song