
Defense against backdoor attacks via robust covariance estimation
Jonathan Hayase · Weihao Kong · Raghav Somani · Sewoong Oh

Thu Jul 22 07:30 AM -- 07:35 AM (PDT)

Modern machine learning increasingly requires training on large collections of data from multiple sources, not all of which can be trusted. A particularly frightening scenario is when a small fraction of corrupted data changes the behavior of the trained model when triggered by an attacker-specified watermark. Such a compromised model can be deployed unnoticed, since it remains accurate on clean inputs. There have been promising attempts to use the intermediate representations of such a model to separate corrupted examples from clean ones. However, these methods require a significant fraction of the data to be corrupted in order to produce a strong enough signal for detection. We propose a novel defense algorithm that uses robust covariance estimation to amplify the spectral signature of corrupted data. This defense completely removes backdoors whenever the benchmark backdoor attacks are successful, even in regimes where previous methods have no hope of detecting the poisoned examples.
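The abstract's core idea can be illustrated with a minimal sketch: robustly estimate the mean and covariance of the intermediate representations (so the estimate is not skewed by the poisoned points), whiten the representations with that estimate, and then score each example by its projection onto the top singular direction of the whitened data. This is not the authors' exact algorithm; the trimming-based robust estimator below and all function names are simplifying assumptions for illustration only.

```python
import numpy as np

def robust_covariance(X, trim_frac=0.1):
    # Illustrative stand-in for a robust covariance estimator (an assumption,
    # not the paper's method): trim the points farthest from the mean, then
    # recompute mean and covariance on the remaining points.
    mu = X.mean(axis=0)
    dist = np.linalg.norm(X - mu, axis=1)
    keep = dist <= np.quantile(dist, 1.0 - trim_frac)
    X_kept = X[keep]
    return X_kept.mean(axis=0), np.cov(X_kept, rowvar=False)

def spectral_scores(X, mu, cov, eps=1e-6):
    # Whiten representations with the robust covariance estimate, then score
    # each example by the magnitude of its projection onto the top singular
    # direction of the whitened data. Whitening suppresses clean-data
    # variance, amplifying the spectral signature of the corrupted examples.
    d = X.shape[1]
    L = np.linalg.cholesky(cov + eps * np.eye(d))  # cov = L @ L.T
    Z = (X - mu) @ np.linalg.inv(L).T              # whitened representations
    _, _, Vt = np.linalg.svd(Z - Z.mean(axis=0), full_matrices=False)
    return np.abs(Z @ Vt[0])

# Hypothetical usage: representations of mostly clean data with a small
# poisoned subpopulation shifted along one direction.
rng = np.random.default_rng(0)
clean = rng.normal(size=(500, 10))
poison = rng.normal(size=(25, 10))
poison[:, 0] += 4.0  # watermark-induced shift in representation space
X = np.vstack([clean, poison])

mu, cov = robust_covariance(X)
scores = spectral_scores(X, mu, cov)
# Examples with the highest scores are flagged as likely poisoned and removed.
```

The trimming step matters: if the covariance were estimated naively on all of the data, the poisoned points would inflate the variance along their own direction, and whitening would partially cancel the very signal the defense is trying to amplify.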

Author Information

Jonathan Hayase (University of Washington)
Weihao Kong (University of Washington)
Raghav Somani (University of Washington)

I am broadly interested in the aspects of Large-Scale Optimization and Probability theory that arise in fundamental Machine Learning.

Sewoong Oh (University of Washington)
