Data tracing determines whether a particular image dataset has been used to train a model. We propose a new technique, radioactive data, that makes imperceptible changes to this dataset such that any model trained on it will bear an identifiable mark. Given a trained model, our technique detects the use of radioactive data and provides a level of confidence (p-value). Experiments on a large-scale benchmark (ImageNet), with standard architectures (ResNet-18, VGG-16, DenseNet-121) and training procedures, show that we detect radioactive data with high confidence (p < 0.0001) even when only 1% of the data used to train a model is radioactive. Our radioactive mark is resilient to strong data augmentations and variations of the model architecture. As a result, it offers a much higher signal-to-noise ratio than data poisoning and backdoor methods.
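The abstract does not say how the p-value is computed. As a rough illustration only, the sketch below assumes one plausible statistic: the cosine similarity between a fixed "mark" direction and a feature-space vector estimated from the model. Under the null hypothesis that the model never saw marked data, that vector is treated as uniformly distributed on the unit sphere in dimension d, so the cosine has density proportional to (1 - t^2)^((d-3)/2). The function name `cosine_p_value`, the choice of statistic, and the numerical integration are all assumptions made for this example, not the authors' stated procedure.

```python
import math


def _simpson(f, a, b, n=20000):
    """Composite Simpson's rule on [a, b] with an even number of intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    odd = math.fsum(f(a + i * h) for i in range(1, n, 2))
    even = math.fsum(f(a + i * h) for i in range(2, n, 2))
    return (f(a) + 4.0 * odd + 2.0 * even + f(b)) * h / 3.0


def cosine_p_value(c, d):
    """P(cos >= c) for a uniformly random unit vector in R^d versus a fixed one.

    Hypothetical helper: assumes the null distribution described in the
    lead-in, with density proportional to (1 - t^2)^((d - 3) / 2) on [-1, 1].
    """
    if d < 3:
        raise ValueError("this sketch requires d >= 3")
    if not -1.0 <= c <= 1.0:
        raise ValueError("cosine similarity must lie in [-1, 1]")
    k = (d - 3) / 2.0

    def density(t):
        # max() guards against tiny negative values caused by rounding at t = 1.
        return max(0.0, 1.0 - t * t) ** k

    # Ratio of the upper-tail mass to the total mass of the unnormalised density.
    return _simpson(density, c, 1.0) / _simpson(density, -1.0, 1.0)


if __name__ == "__main__":
    # In high dimension, even a modest cosine is very unlikely under the null.
    for c in (0.05, 0.1, 0.3):
        print(f"d=512, cosine={c}: p-value={cosine_p_value(c, 512):.3e}")
```

A small p-value means the observed alignment would be very unlikely if the model had not been trained on marked data. In dimension 3 the cosine is uniform on [-1, 1], which gives a simple sanity check: `cosine_p_value(0.5, 3)` should be close to 0.25.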
Author Information
Alexandre Sablayrolles (Facebook AI)
Douze Matthijs (Facebook AI Research)
Cordelia Schmid (Inria/Google)
Herve Jegou (Facebook AI Research)
More from the Same Authors
- 2023 Poster: Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano
  Chuan Guo · Alexandre Sablayrolles · Maziar Sanjabi
- 2023 Poster: TAN Without a Burn: Scaling Laws of DP-SGD
  Tom Sander · Pierre Stock · Alexandre Sablayrolles
- 2021 Poster: Training data-efficient image transformers & distillation through attention
  Hugo Touvron · Matthieu Cord · Douze Matthijs · Francisco Massa · Alexandre Sablayrolles · Herve Jegou
- 2021 Spotlight: Training data-efficient image transformers & distillation through attention
  Hugo Touvron · Matthieu Cord · Douze Matthijs · Francisco Massa · Alexandre Sablayrolles · Herve Jegou
- 2021 Poster: Goal-Conditioned Reinforcement Learning with Imagined Subgoals
  Elliot Chane-Sane · Cordelia Schmid · Ivan Laptev
- 2021 Spotlight: Goal-Conditioned Reinforcement Learning with Imagined Subgoals
  Elliot Chane-Sane · Cordelia Schmid · Ivan Laptev
- 2019 Poster: White-box vs Black-box: Bayes Optimal Strategies for Membership Inference
  Alexandre Sablayrolles · Douze Matthijs · Cordelia Schmid · Yann Ollivier · Herve Jegou
- 2019 Oral: White-box vs Black-box: Bayes Optimal Strategies for Membership Inference
  Alexandre Sablayrolles · Douze Matthijs · Cordelia Schmid · Yann Ollivier · Herve Jegou
- 2017 Poster: Efficient softmax approximation for GPUs
  Edouard Grave · Armand Joulin · Moustapha Cisse · David Grangier · Herve Jegou
- 2017 Talk: Efficient softmax approximation for GPUs
  Edouard Grave · Armand Joulin · Moustapha Cisse · David Grangier · Herve Jegou