TMI! Finetuned Models Spill Secrets from Pretraining
John Abascal · Stanley Wu · Alina Oprea · Jonathan Ullman
Event URL: https://openreview.net/forum?id=B0mUr6QI8F

Transfer learning has become an increasingly popular technique in machine learning as a way to leverage a model pretrained on related tasks. This paradigm has been especially popular for privacy-preserving machine learning, where the pretrained model is considered public and only the finetuning data is considered sensitive. However, there are reasons to believe that the data used for pretraining is still sensitive. In this work, we study privacy leakage via membership-inference attacks, and we propose a new threat model in which the adversary has access only to the finetuned model and wants to infer whether a given example was part of the pretraining data. To realize this threat model, we implement a novel metaclassifier-based attack, TMI. We evaluate TMI on both vision and natural language tasks across multiple transfer learning settings, including finetuning with differential privacy. Our evaluation shows that TMI can successfully infer membership of pretraining examples using only query access to the finetuned model.

Author Information

John Abascal (Northeastern University)
Stanley Wu
Alina Oprea (Northeastern University)
Jonathan Ullman (Northeastern University)
