Indicators of Attack Failure: Debugging and Improving Optimization of Adversarial Examples
Maura Pintor · Luca Demetrio · Angelo Sotgiu · Giovanni Manca · Ambra Demontis · Nicholas Carlini · Battista Biggio · Fabio Roli

Evaluating the robustness of machine-learning models to adversarial examples is a challenging problem. Many defenses have been shown to provide a false sense of security by causing gradient-based attacks to fail, and they have been broken under more rigorous evaluations. Although guidelines and best practices have been suggested to improve current adversarial robustness evaluations, the lack of automatic testing and debugging tools makes it difficult to apply these recommendations systematically. In this work, we overcome these limitations by (i) defining a set of quantitative indicators that reveal common failures in the optimization of gradient-based attacks, and (ii) proposing specific mitigation strategies within a systematic evaluation protocol. Our extensive experimental analysis shows that the proposed indicators of failure can be used to visualize, debug, and improve current adversarial robustness evaluations, providing a first concrete step towards automating and systematizing them.

Author Information

Maura Pintor (University of Cagliari)

Maura Pintor is a Postdoctoral Researcher at the PRA Lab, in the Department of Electrical and Electronic Engineering of the University of Cagliari, Italy. She received the MSc degree in Telecommunications Engineering with honors in 2018 and the PhD degree in Electronic and Computer Engineering from the University of Cagliari in 2022. Her PhD thesis, "Towards Debugging and Improving Adversarial Robustness Evaluations", provides a framework for optimizing and debugging adversarial attacks. She is co-author of the paper "Why Do Adversarial Attacks Transfer? Explaining Transferability of Evasion and Poisoning Attacks", accepted at USENIX Sec. 2019, and of the paper "Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints", accepted at NeurIPS 2021. She was a visiting student at Eberhard Karls Universitaet Tuebingen from March to June 2020. She has collaborated with Pluribus One in the EU H2020 projects ALOHA and AssureMOSS.

Luca Demetrio (Università degli Studi di Cagliari)
Angelo Sotgiu (University of Cagliari)
Giovanni Manca (Università degli studi di Cagliari)
Ambra Demontis (University of Cagliari)
Nicholas Carlini (Google)
Battista Biggio (University of Cagliari, Italy)
Fabio Roli (University of Cagliari)
