Poster
MultiRobustBench: Benchmarking Robustness Against Multiple Attacks
Sophie Dai · Saeed Mahloujifar · Chong Xiang · Vikash Sehwag · Pin-Yu Chen · Prateek Mittal
Event URL: https://multirobustbench.github.io
The bulk of existing research on defending against adversarial examples focuses on a single (typically bounded $\ell_p$-norm) attack, but in practical settings, machine learning (ML) models should be robust to a wide variety of attacks. In this paper, we present the first unified framework for considering multiple attacks against ML models. Our framework can model different levels of the learner's knowledge about the test-time adversary, allowing us to capture robustness against unforeseen attacks as well as robustness against unions of attacks. Using our framework, we present the first leaderboard, MultiRobustBench (https://multirobustbench.github.io), for benchmarking multi-attack evaluation, which captures performance across attack types and attack strengths. We evaluate 16 defended models for robustness against a set of 9 different attack types, including $\ell_p$-based threat models, spatial transformations, and color changes, at 20 different attack strengths (180 attacks in total). Additionally, we analyze the state of current defenses against multiple attacks. Our analysis shows that while existing defenses have made progress in average robustness across the set of attacks used, robustness against the worst-case attack remains a major open problem: all existing models perform worse than random guessing.
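To make the evaluation protocol concrete, the sketch below shows one way per-attack robust accuracies over a grid of attack types and strengths could be aggregated into the average-case and worst-case robustness numbers the leaderboard reports. This is a minimal illustration assuming per-attack accuracy scores are already available; the attack names, strength grid, and `aggregate` helper are hypothetical, not the benchmark's actual code or API.

```python
# Minimal sketch (not the MultiRobustBench API) of aggregating per-attack
# robust accuracies into average-case and worst-case robustness.
from itertools import product

ATTACK_TYPES = ["l_inf", "l_2", "spatial", "recolor"]  # illustrative subset of the 9 attack types
STRENGTHS = [i / 20 for i in range(1, 21)]             # 20 strength levels per attack type

def aggregate(robust_acc):
    """Return (average-case, worst-case) robust accuracy over the attack grid.

    `robust_acc` maps (attack_type, strength) -> robust accuracy in [0, 1].
    Worst case here means the single strongest attack in the grid; a
    per-example worst case would require per-example predictions instead.
    """
    accs = [robust_acc[(a, s)] for a, s in product(ATTACK_TYPES, STRENGTHS)]
    return sum(accs) / len(accs), min(accs)

# Usage with placeholder scores (a real evaluation would run each attack):
scores = {(a, s): max(0.0, 0.9 - s) for a, s in product(ATTACK_TYPES, STRENGTHS)}
avg_rob, worst_rob = aggregate(scores)
print(f"average robustness: {avg_rob:.3f}, worst-case robustness: {worst_rob:.3f}")
```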
Author Information
Sophie Dai (Princeton University)
Saeed Mahloujifar (Meta)
Chong Xiang (Princeton University)
Vikash Sehwag (Princeton University)
Pin-Yu Chen (IBM Research)
Prateek Mittal (Princeton University)
More from the Same Authors
- 2021: Generalizing Adversarial Training to Composite Semantic Perturbations
  Yun-Yun Tsai · Lei Hsiung · Pin-Yu Chen · Tsung-Yi Ho
- 2021: On the Effectiveness of Poisoning against Unsupervised Domain Adaptation
  Akshay Mehra · Bhavya Kailkhura · Pin-Yu Chen · Jihun Hamm
- 2023: Teach GPT To Phish
  Ashwinee Panda · Zhengming Zhang · Yaoqing Yang · Prateek Mittal
- 2023: Characterizing the Optimal $0-1$ Loss for Multi-class Classification with a Test-time Attacker
  Sophie Dai · Wenxin Ding · Arjun Nitin Bhagoji · Daniel Cullina · Ben Zhao · Heather Zheng · Prateek Mittal
- 2023: Which Features are Learned by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression
  Yihao Xue · Siddharth Joshi · Eric Gan · Pin-Yu Chen · Baharan Mirzasoleiman
- 2023: A Privacy-Friendly Approach to Data Valuation
  Jiachen Wang · Yuqing Zhu · Yu-Xiang Wang · Ruoxi Jia · Prateek Mittal
- 2023: On the Reproducibility of Data Valuation under Learning Stochasticity
  Jiachen Wang · Feiyang Kang · Chiyuan Zhang · Ruoxi Jia · Prateek Mittal
- 2023: Machine Learning with Feature Differential Privacy
  Saeed Mahloujifar · Chuan Guo · G. Edward Suh · Kamalika Chaudhuri
- 2023: On Robustness-Accuracy Characterization of Large Language Models using Synthetic Datasets
  Ching-Yun (Irene) Ko · Pin-Yu Chen · Payel Das · Yung-Sung Chuang · Luca Daniel
- 2023: Differentially Private Generation of High Fidelity Samples From Diffusion Models
  Vikash Sehwag · Ashwinee Panda · Ashwini Pokle · Xinyu Tang · Saeed Mahloujifar · Mung Chiang · Zico Kolter · Prateek Mittal
- 2023: Visual Adversarial Examples Jailbreak Aligned Large Language Models
  Xiangyu Qi · Kaixuan Huang · Ashwinee Panda · Mengdi Wang · Prateek Mittal
- 2023 Oral: Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks
  Mohammed Nowaz Rabbani Chowdhury · Shuai Zhang · Meng Wang · Sijia Liu · Pin-Yu Chen
- 2023 Poster: Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression
  Yihao Xue · Siddharth Joshi · Eric Gan · Pin-Yu Chen · Baharan Mirzasoleiman
- 2023 Poster: Compressed Decentralized Proximal Stochastic Gradient Method for Nonconvex Composite Problems with Heterogeneous Data
  Yonggui Yan · Jie Chen · Pin-Yu Chen · Xiaodong Cui · Songtao Lu · Yangyang Xu
- 2023 Poster: Identification of the Adversary from a Single Adversarial Example
  Minhao Cheng · Rui Min · Haochen Sun · Pin-Yu Chen
- 2023 Poster: Effectively Using Public Data in Privacy Preserving Machine Learning
  Milad Nasresfahani · Saeed Mahloujifar · Xinyu Tang · Prateek Mittal · Amir Houmansadr
- 2023 Oral: Which Features are Learnt by Contrastive Learning? On the Role of Simplicity Bias in Class Collapse and Feature Suppression
  Yihao Xue · Siddharth Joshi · Eric Gan · Pin-Yu Chen · Baharan Mirzasoleiman
- 2023 Poster: Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks
  Mohammed Nowaz Rabbani Chowdhury · Shuai Zhang · Meng Wang · Sijia Liu · Pin-Yu Chen
- 2023 Poster: Reprogramming Pretrained Language Models for Antibody Sequence Infilling
  Igor Melnyk · Vijil Chenthamarakshan · Pin-Yu Chen · Payel Das · Amit Dhurandhar · Inkit Padhi · Devleena Das
- 2023 Poster: Uncovering Adversarial Risks of Test-Time Adaptation
  Tong Wu · Feiran Jia · Xiangyu Qi · Jiachen Wang · Vikash Sehwag · Saeed Mahloujifar · Prateek Mittal
- 2022: Learner Knowledge Levels in Adversarial Machine Learning
  Sophie Dai · Prateek Mittal
- 2022 Poster: Neurotoxin: Durable Backdoors in Federated Learning
  Zhengming Zhang · Ashwinee Panda · Linyue Song · Yaoqing Yang · Michael Mahoney · Prateek Mittal · Kannan Ramchandran · Joseph E Gonzalez
- 2022 Spotlight: Neurotoxin: Durable Backdoors in Federated Learning
  Zhengming Zhang · Ashwinee Panda · Linyue Song · Yaoqing Yang · Michael Mahoney · Prateek Mittal · Kannan Ramchandran · Joseph E Gonzalez
- 2021 Poster: Lower Bounds on Cross-Entropy Loss in the Presence of Test-time Adversaries
  Arjun Nitin Bhagoji · Daniel Cullina · Vikash Sehwag · Prateek Mittal
- 2021 Spotlight: Lower Bounds on Cross-Entropy Loss in the Presence of Test-time Adversaries
  Arjun Nitin Bhagoji · Daniel Cullina · Vikash Sehwag · Prateek Mittal
- 2019 Poster: Analyzing Federated Learning through an Adversarial Lens
  Arjun Nitin Bhagoji · Supriyo Chakraborty · Prateek Mittal · Seraphin Calo
- 2019 Oral: Analyzing Federated Learning through an Adversarial Lens
  Arjun Nitin Bhagoji · Supriyo Chakraborty · Prateek Mittal · Seraphin Calo