Deep neural networks (DNNs) have been found to be vulnerable to backdoor attacks, raising security concerns about their deployment in mission-critical applications. While existing defense methods have demonstrated promising results, it is still not clear how to effectively remove backdoor-associated neurons in backdoored DNNs. In this paper, we propose a novel defense called Reconstructive Neuron Pruning (RNP) to expose and prune backdoor neurons via an unlearning and then recovering process. Specifically, RNP first unlearns the neurons by maximizing the model's error on a small subset of clean samples and then recovers the neurons by minimizing the model's error on the same data. In RNP, unlearning is operated at the neuron level while recovering is operated at the filter level, forming an asymmetric reconstructive learning procedure. We show that such an asymmetric process on only a few clean samples can effectively expose and prune the backdoor neurons implanted by a wide range of attacks, achieving a new state-of-the-art defense performance. Moreover, the unlearned model at the intermediate step of our RNP can be directly used to improve other backdoor defense tasks including backdoor removal, trigger recovery, backdoor label detection, and backdoor sample detection. Code is available at https://github.com/bboylyg/RNP.
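The unlearn-then-recover procedure described in the abstract can be sketched on a toy model. This is a minimal illustration, not the authors' implementation (see the linked repository for that). It assumes a linear model in which each "filter" owns a few weights, reads "neuron level" as updating every individual weight and "filter level" as learning one mask per filter, and uses a made-up dataset, step sizes, and pruning threshold. All function and variable names are hypothetical. It shows only the two-phase control flow, not how well backdoor neurons are detected.

```python
import math

# Toy model: logit = sum_j mask_j * (w_j . x_j), one weight group per "filter".
# All names and numbers are illustrative assumptions, not the RNP codebase.

def logit(filters, masks, x_groups):
    return sum(m * sum(w * x for w, x in zip(ws, xs))
               for m, ws, xs in zip(masks, filters, x_groups))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def clean_loss(filters, masks, data):
    """Mean logistic loss on the clean samples (labels are 0 or 1)."""
    total = 0.0
    for x_groups, y in data:
        z = logit(filters, masks, x_groups)
        total += math.log1p(math.exp(z)) - y * z
    return total / len(data)

def unlearn(filters, data, lr=0.1, steps=10):
    """Phase 1: gradient ASCENT on every individual weight (maximize the error)."""
    filters = [list(ws) for ws in filters]
    ones = [1.0] * len(filters)
    for _ in range(steps):
        grads = [[0.0] * len(ws) for ws in filters]
        for x_groups, y in data:
            err = sigmoid(logit(filters, ones, x_groups)) - y
            for j, xs in enumerate(x_groups):
                for k, x in enumerate(xs):
                    grads[j][k] += err * x / len(data)
        filters = [[w + lr * g for w, g in zip(ws, gs)]
                   for ws, gs in zip(filters, grads)]
    return filters

def recover(filters, data, lr=0.05, steps=200):
    """Phase 2: gradient DESCENT on one mask per filter; weights stay frozen."""
    masks = [1.0] * len(filters)
    for _ in range(steps):
        grads = [0.0] * len(filters)
        for x_groups, y in data:
            err = sigmoid(logit(filters, masks, x_groups)) - y
            for j, (ws, xs) in enumerate(zip(filters, x_groups)):
                grads[j] += err * sum(w * x for w, x in zip(ws, xs)) / len(data)
        # Keep every mask inside [0, 1] after the update.
        masks = [min(1.0, max(0.0, m - lr * g)) for m, g in zip(masks, grads)]
    return masks

def prune(masks, threshold=0.5):
    """Indices of filters whose recovered mask is below the assumed threshold."""
    return [j for j, m in enumerate(masks) if m < threshold]

# Four made-up clean samples; two filters with two inputs each.
clean_data = [
    ([[1.0, 0.5], [0.2, -0.1]], 1),
    ([[0.8, 0.9], [-0.3, 0.4]], 1),
    ([[-1.0, -0.4], [0.1, 0.3]], 0),
    ([[-0.7, -0.8], [0.5, -0.2]], 0),
]
initial_filters = [[1.0, 1.0], [0.1, -0.1]]
all_ones = [1.0, 1.0]

loss_before = clean_loss(initial_filters, all_ones, clean_data)
unlearned = unlearn(initial_filters, clean_data)
loss_unlearned = clean_loss(unlearned, all_ones, clean_data)
masks = recover(unlearned, clean_data)
loss_recovered = clean_loss(unlearned, masks, clean_data)
pruned = prune(masks)
```

Printing `loss_before`, `loss_unlearned`, `loss_recovered`, `masks`, and `pruned` shows the clean loss after each phase and which filters the toy threshold would flag. The asymmetry lies in which parameters each phase is allowed to change: unlearning moves the weights themselves, while recovering can only rescale whole filters.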
Author Information
Yige Li (Xidian University)
XIXIANG LYU (Xidian University)
Xingjun Ma (Deakin University)
Nodens Koren (The University of Melbourne)
Lingjuan Lyu (Sony Research)
Bo Li (UIUC)

Dr. Bo Li is an assistant professor in the Department of Computer Science at the University of Illinois at Urbana–Champaign. She is the recipient of the IJCAI Computers and Thought Award, Alfred P. Sloan Research Fellowship, AI’s 10 to Watch, NSF CAREER Award, MIT Technology Review TR-35 Award, Dean's Award for Excellence in Research, C.W. Gear Outstanding Junior Faculty Award, Intel Rising Star Award, Symantec Research Labs Fellowship, Rising Star Award, research awards from tech companies such as Amazon, Facebook, Intel, IBM, and eBay, and best paper awards at several top machine learning and security conferences. Her research focuses on both theoretical and practical aspects of trustworthy machine learning, which lies at the intersection of machine learning, security, privacy, and game theory. She has designed several scalable frameworks for trustworthy machine learning and privacy-preserving data publishing. Her work has been featured by major publications and media outlets such as Nature, Wired, Fortune, and The New York Times.
Yu-Gang Jiang (Fudan University)
More from the Same Authors
-
2021 : Adversarial Interaction Attacks: Fooling AI to Misinterpret Human Intentions »
Nodens Koren · Xingjun Ma · Qiuhong Ke · Yisen Wang · James Bailey -
2022 : Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables »
Mengdi Xu · Peide Huang · Visak Kumar · Jielin Qiu · Chao Fang · Kuan-Hui Lee · Xuewei Qi · Henry Lam · Bo Li · Ding Zhao -
2022 : Paper 10: CausalAF: Causal Autoregressive Flow for Safety-Critical Scenes Generation »
Wenhao Ding · Haohong Lin · Bo Li · Ding Zhao · Hitesh Arora -
2023 : DiffScene: Diffusion-Based Safety-Critical Scenario Generation for Autonomous Vehicles »
Chejian Xu · Ding Zhao · Alberto Sangiovanni-Vincentelli · Bo Li -
2023 : Semantically Adversarial Scene Generation with Explicit Knowledge Guidance for Autonomous Driving »
Wenhao Ding · Haohong Lin · Bo Li · Ding Zhao -
2023 : Can Public Large Language Models Help Private Cross-device Federated Learning? »
Boxin Wang · Yibo J. Zhang · Yuan Cao · Bo Li · Hugh B McMahan · Sewoong Oh · Zheng Xu · Manzil Zaheer -
2023 : Visual-based Policy Learning with Latent Language Encoding »
Jielin Qiu · Mengdi Xu · William Han · Bo Li · Ding Zhao -
2023 : Can Brain Signals Reveal Inner Alignment with Human Languages? »
Jielin Qiu · William Han · Jiacheng Zhu · Mengdi Xu · Douglas Weber · Bo Li · Ding Zhao -
2023 Workshop: Federated Learning and Analytics in Practice: Algorithms, Systems, Applications, and Opportunities »
Zheng Xu · Peter Kairouz · Bo Li · Tian Li · John Nguyen · Jianyu Wang · Shiqiang Wang · Ayfer Ozgur -
2023 Workshop: Knowledge and Logical Reasoning in the Era of Data-driven Learning »
Nezihe Merve Gürel · Bo Li · Theodoros Rekatsinas · Beliz Gunel · Alberto Sangiovanni-Vincentelli · Paroma Varma -
2023 Poster: Revisiting Data-Free Knowledge Distillation with Poisoned Teachers »
Junyuan Hong · Yi Zeng · Shuyang Yu · Lingjuan Lyu · Ruoxi Jia · Jiayu Zhou -
2023 Poster: UMD: Unsupervised Model Detection for X2X Backdoor Attacks »
Zhen Xiang · Zidi Xiong · Bo Li -
2023 Poster: Byzantine-Robust Learning on Heterogeneous Data via Gradient Splitting »
Yuchen Liu · Chen Chen · Lingjuan Lyu · Fangzhao Wu · Sai Wu · Gang Chen -
2023 Poster: Dimension-independent Certified Neural Network Watermarks via Mollifier Smoothing »
Jiaxiang Ren · Yang Zhou · Jiayin Jin · Lingjuan Lyu · Da Yan -
2023 Poster: Interpolation for Robust Learning: Data Augmentation on Wasserstein Geodesics »
Jiacheng Zhu · Jielin Qiu · Aritra Guha · Zhuolin Yang · XuanLong Nguyen · Bo Li · Ding Zhao -
2023 Poster: Open-VCLIP: Transforming CLIP to an Open-vocabulary Video Model via Interpolated Weight Optimization »
Zejia Weng · Xitong Yang · Ang Li · Zuxuan Wu · Yu-Gang Jiang -
2023 Poster: Fast Federated Machine Unlearning with Nonlinear Functional Theory »
Tianshi Che · Yang Zhou · Zijie Zhang · Lingjuan Lyu · Ji Liu · Da Yan · Dejing Dou · Jun Huan -
2022 : Paper 15: On the Robustness of Safe Reinforcement Learning under Observational Perturbations »
Zuxin Liu · Zhepeng Cen · Huan Zhang · Jie Tan · Bo Li · Ding Zhao -
2022 Poster: Constrained Variational Policy Optimization for Safe Reinforcement Learning »
Zuxin Liu · Zhepeng Cen · Vladislav Isenbaev · Wei Liu · Steven Wu · Bo Li · Ding Zhao -
2022 Poster: Provable Domain Generalization via Invariant-Feature Subspace Recovery »
Haoxiang Wang · Haozhe Si · Bo Li · Han Zhao -
2022 Spotlight: Constrained Variational Policy Optimization for Safe Reinforcement Learning »
Zuxin Liu · Zhepeng Cen · Vladislav Isenbaev · Wei Liu · Steven Wu · Bo Li · Ding Zhao -
2022 Spotlight: Provable Domain Generalization via Invariant-Feature Subspace Recovery »
Haoxiang Wang · Haozhe Si · Bo Li · Han Zhao -
2022 Poster: How to Steer Your Adversary: Targeted and Efficient Model Stealing Defenses with Gradient Redirection »
Mantas Mazeika · Bo Li · David Forsyth -
2022 Poster: Adversarially Robust Models may not Transfer Better: Sufficient Conditions for Domain Transferability from the View of Regularization »
Xiaojun Xu · Yibo Zhang · Evelyn Ma · Hyun Ho Son · Sanmi Koyejo · Bo Li -
2022 Poster: Understanding Gradual Domain Adaptation: Improved Analysis, Optimal Path and Beyond »
Haoxiang Wang · Bo Li · Han Zhao -
2022 Spotlight: How to Steer Your Adversary: Targeted and Efficient Model Stealing Defenses with Gradient Redirection »
Mantas Mazeika · Bo Li · David Forsyth -
2022 Spotlight: Adversarially Robust Models may not Transfer Better: Sufficient Conditions for Domain Transferability from the View of Regularization »
Xiaojun Xu · Yibo Zhang · Evelyn Ma · Hyun Ho Son · Sanmi Koyejo · Bo Li -
2022 Spotlight: Understanding Gradual Domain Adaptation: Improved Analysis, Optimal Path and Beyond »
Haoxiang Wang · Bo Li · Han Zhao -
2022 Poster: Certifying Out-of-Domain Generalization for Blackbox Functions »
Maurice Weber · Linyi Li · Boxin Wang · Zhikuan Zhao · Bo Li · Ce Zhang -
2022 Poster: Double Sampling Randomized Smoothing »
Linyi Li · Jiawei Zhang · Tao Xie · Bo Li -
2022 Poster: TPC: Transformation-Specific Smoothing for Point Cloud Models »
Wenda Chu · Linyi Li · Bo Li -
2022 Spotlight: TPC: Transformation-Specific Smoothing for Point Cloud Models »
Wenda Chu · Linyi Li · Bo Li -
2022 Spotlight: Double Sampling Randomized Smoothing »
Linyi Li · Jiawei Zhang · Tao Xie · Bo Li -
2022 Spotlight: Certifying Out-of-Domain Generalization for Blackbox Functions »
Maurice Weber · Linyi Li · Boxin Wang · Zhikuan Zhao · Bo Li · Ce Zhang -
2021 : Discussion Panel #2 »
Bo Li · Nicholas Carlini · Andrzej Banburski · Kamalika Chaudhuri · Will Xiao · Cihang Xie -
2021 Workshop: A Blessing in Disguise: The Prospects and Perils of Adversarial Machine Learning »
Hang Su · Yinpeng Dong · Tianyu Pang · Eric Wong · Zico Kolter · Shuo Feng · Bo Li · Henry Liu · Dan Hendrycks · Francesco Croce · Leslie Rice · Tian Tian -
2021 Poster: Uncovering the Connections Between Adversarial Transferability and Knowledge Transferability »
Kaizhao Liang · Yibo Zhang · Boxin Wang · Zhuolin Yang · Sanmi Koyejo · Bo Li -
2021 Poster: CRFL: Certifiably Robust Federated Learning against Backdoor Attacks »
Chulin Xie · Minghao Chen · Pin-Yu Chen · Bo Li -
2021 Poster: Progressive-Scale Boundary Blackbox Attack via Projective Gradient Estimation »
Jiawei Zhang · Linyi Li · Huichen Li · Xiaolu Zhang · Shuang Yang · Bo Li -
2021 Poster: Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation »
Haoxiang Wang · Han Zhao · Bo Li -
2021 Spotlight: Progressive-Scale Boundary Blackbox Attack via Projective Gradient Estimation »
Jiawei Zhang · Linyi Li · Huichen Li · Xiaolu Zhang · Shuang Yang · Bo Li -
2021 Spotlight: Uncovering the Connections Between Adversarial Transferability and Knowledge Transferability »
Kaizhao Liang · Yibo Zhang · Boxin Wang · Zhuolin Yang · Sanmi Koyejo · Bo Li -
2021 Spotlight: Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation »
Haoxiang Wang · Han Zhao · Bo Li -
2021 Spotlight: CRFL: Certifiably Robust Federated Learning against Backdoor Attacks »
Chulin Xie · Minghao Chen · Pin-Yu Chen · Bo Li -
2021 Poster: Knowledge Enhanced Machine Learning Pipeline against Diverse Adversarial Attacks »
Nezihe Merve Gürel · Xiangyu Qi · Luka Rimanic · Ce Zhang · Bo Li -
2021 Spotlight: Knowledge Enhanced Machine Learning Pipeline against Diverse Adversarial Attacks »
Nezihe Merve Gürel · Xiangyu Qi · Luka Rimanic · Ce Zhang · Bo Li -
2020 Poster: Improving Robustness of Deep-Learning-Based Image Reconstruction »
Ankit Raj · Yoram Bresler · Bo Li