Past work has shown that large language models are susceptible to privacy attacks, where adversaries generate sequences from a trained model and detect which sequences are memorized from the training set. In this work, we show that the success of these attacks is largely due to duplication in commonly used web-scraped training sets. We first show that the rate at which language models regenerate training sequences is superlinearly related to a sequence's count in the training set. For instance, a sequence that is present 10 times in the training data is on average generated 1000x more often than a sequence that is present only once. We next show that existing methods for detecting memorized sequences have near-chance accuracy on non-duplicated training sequences. Finally, we find that after applying methods to deduplicate training data, language models are considerably more secure against these types of privacy attacks. Taken together, our results motivate an increased focus on deduplication in privacy-sensitive applications and a reevaluation of the practicality of existing privacy attacks.
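The abstract centers on two measurable quantities: how often a model's generations reproduce training sequences with a given duplicate count, and the effect of removing exact duplicates from the training data. A minimal illustrative sketch of that bookkeeping is below; the function names and the whole-sequence granularity are assumptions for illustration, not the paper's actual pipeline, which operates on substrings of large web-scraped corpora.

```python
from collections import Counter
from typing import Iterable, List

def count_sequences(training_data: Iterable[str]) -> Counter:
    """Count how many times each training sequence appears verbatim."""
    return Counter(training_data)

def deduplicate(training_data: Iterable[str]) -> List[str]:
    """Keep a single copy of each exact-duplicate training sequence."""
    seen = set()
    unique = []
    for seq in training_data:
        if seq not in seen:
            seen.add(seq)
            unique.append(seq)
    return unique

def regeneration_rate(generations: Iterable[str], counts: Counter, k: int) -> float:
    """Fraction of model generations that exactly reproduce a training
    sequence appearing k times in the training set (hypothetical metric
    name; the paper reports this as a function of duplicate count)."""
    generations = list(generations)
    hits = sum(1 for g in generations if counts.get(g, 0) == k)
    return hits / max(len(generations), 1)
```

Plotting the hypothetical `regeneration_rate` against `k` on log-log axes would surface the superlinear relationship the abstract describes; training on the output of `deduplicate` is the mitigation the paper evaluates.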
Author Information
Nikhil Kandpal (University of North Carolina, Chapel Hill)
Eric Wallace (U.C. Berkeley)
Colin Raffel (Google Brain)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Poster: Deduplicating Training Data Mitigates Privacy Risks in Language Models
  Wed. Jul 20 through Thu. Jul 21, Hall E #1025
More from the Same Authors
- 2023 Poster: Git-Theta: A Git Extension for Collaborative Development of Machine Learning Models
  Nikhil Kandpal · Brian Lester · Mohammed Muqeeth · Anisha Mascarenhas · Monty Evans · Vishal Baskaran · Tenghao Huang · Haokun Liu · Colin Raffel
- 2023 Poster: Poisoning Language Models During Instruction Tuning
  Alexander Wan · Eric Wallace · Sheng Shen · Dan Klein
- 2023 Poster: Large Language Models Struggle to Learn Long-Tail Knowledge
  Nikhil Kandpal · Haikang Deng · Adam Roberts · Eric Wallace · Colin Raffel
- 2022 Workshop: The First Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward
  Huaxiu Yao · Hugo Larochelle · Percy Liang · Colin Raffel · Jian Tang · Ying Wei · Saining Xie · Eric Xing · Chelsea Finn
- 2022 Poster: What Language Model Architecture and Pretraining Objective Works Best for Zero-Shot Generalization?
  Thomas Wang · Adam Roberts · Daniel Hesslow · Teven Le Scao · Hyung Won Chung · Iz Beltagy · Julien Launay · Colin Raffel
- 2022 Spotlight: What Language Model Architecture and Pretraining Objective Works Best for Zero-Shot Generalization?
  Thomas Wang · Adam Roberts · Daniel Hesslow · Teven Le Scao · Hyung Won Chung · Iz Beltagy · Julien Launay · Colin Raffel
- 2021 Poster: Calibrate Before Use: Improving Few-shot Performance of Language Models
  Tony Z. Zhao · Eric Wallace · Shi Feng · Dan Klein · Sameer Singh
- 2021 Oral: Calibrate Before Use: Improving Few-shot Performance of Language Models
  Tony Z. Zhao · Eric Wallace · Shi Feng · Dan Klein · Sameer Singh
- 2020 Poster: Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
  Zhuohan Li · Eric Wallace · Sheng Shen · Kevin Lin · Kurt Keutzer · Dan Klein · Joseph Gonzalez
- 2017 Poster: Online and Linear-Time Attention by Enforcing Monotonic Alignments
  Colin Raffel · Thang Luong · Peter Liu · Ron Weiss · Douglas Eck
- 2017 Talk: Online and Linear-Time Attention by Enforcing Monotonic Alignments
  Colin Raffel · Thang Luong · Peter Liu · Ron Weiss · Douglas Eck