Poster
Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks
David Stutz · Matthias Hein · Bernt Schiele
Tue Jul 14 01:00 PM -- 01:45 PM & Wed Jul 15 02:00 AM -- 02:45 AM (PDT) @ Virtual
Adversarial training yields models that are robust against a specific threat model, e.g., $L_\infty$ adversarial examples. Typically, robustness does not generalize to previously unseen threat models, e.g., other $L_p$ norms or larger perturbations. Our confidence-calibrated adversarial training (CCAT) tackles this problem by biasing the model towards low-confidence predictions on adversarial examples. By allowing the model to reject examples with low confidence, robustness generalizes beyond the threat model employed during training. CCAT, trained only on $L_\infty$ adversarial examples, increases robustness against larger $L_\infty$, $L_2$, $L_1$ and $L_0$ attacks, adversarial frames, distal adversarial examples and corrupted examples, and yields better clean accuracy than adversarial training. For a thorough evaluation, we developed novel white- and black-box attacks that directly attack CCAT by maximizing confidence. For each threat model, we use $7$ attacks with up to $50$ restarts and $5000$ iterations and report the worst-case robust test error, extended to our confidence-thresholded setting, across all attacks.
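The confidence-thresholded, worst-case evaluation described in the abstract could be sketched roughly as follows. This is only an illustration of one plausible reading, not necessarily the exact metric used in the paper: the function names, the threshold `tau`, the use of the maximum class probability as confidence, and the choice to normalize by the number of examples with at least one accepted variant are all assumptions made here.

```python
from typing import List, Sequence, Tuple

Probs = Sequence[float]  # one predicted class distribution (hypothetical alias)


def predict(probs: Probs) -> Tuple[int, float]:
    """Return (predicted class, confidence); confidence is the max probability."""
    label = max(range(len(probs)), key=lambda k: probs[k])
    return label, probs[label]


def thresholded_robust_error(
    clean: Sequence[Probs],
    adversarial: Sequence[Sequence[Probs]],
    labels: Sequence[int],
    tau: float,
) -> float:
    """Worst-case error across attacks, counting only non-rejected inputs.

    adversarial[a][n] holds the model output for example n under attack a.
    An input is rejected when its confidence falls below tau (assumption).
    """
    errors = 0
    accepted = 0
    for n, y in enumerate(labels):
        # Clean input plus its perturbed version from every attack.
        variants: List[Probs] = [clean[n]] + [attack[n] for attack in adversarial]
        kept = [predict(p) for p in variants]
        kept = [(label, conf) for label, conf in kept if conf >= tau]
        if not kept:
            continue  # every variant of this example is rejected
        accepted += 1
        # Worst case: one accepted, misclassified variant makes it an error.
        if any(label != y for label, _ in kept):
            errors += 1
    return errors / accepted if accepted else 0.0
```

With `tau = 0` nothing is rejected and the function reduces to an ordinary worst-case robust error; raising `tau` removes low-confidence inputs from the count, which is the mechanism the abstract relies on for generalizing to unseen attacks.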
Author Information
David Stutz (Max Planck Institute for Informatics)
Matthias Hein (University of Tübingen)
Bernt Schiele (MPI Informatics)
More from the Same Authors
-
2021 : A Closer Look at the Adversarial Robustness of Information Bottleneck Models »
Iryna Korshunova · David Stutz · Alexander Alemi · Olivia Wiles · Sven Gowal -
2022 : Provably Adversarially Robust Detection of Out-of-Distribution Data (Almost) for Free »
Alexander Meinke · Julian Bitterwolf · Matthias Hein -
2022 : Sound randomized smoothing in floating-point arithmetics »
Václav Voráček · Matthias Hein -
2022 : Are We Viewing the Problem of Robust Generalisation through the Appropriate Lens? »
Mohamed Omran · Bernt Schiele -
2022 : Classifiers Should Do Well Even on Their Worst Classes »
Julian Bitterwolf · Alexander Meinke · Valentyn Boreiko · Matthias Hein -
2022 : Lost in Translation: Modern Image Classifiers still degrade even under simple Translations »
Leander Kurscheidt · Matthias Hein -
2022 : Towards Systematic Robustness for Scalable Visual Recognition »
Mohamed Omran · Bernt Schiele -
2023 : Robust Semantic Segmentation: Strong Adversarial Attacks and Fast Training of Robust Models »
Francesco Croce · Naman Singh · Matthias Hein -
2023 : In or Out? Fixing ImageNet Out-of-Distribution Detection Evaluation »
Julian Bitterwolf · Maximilian Müller · Matthias Hein -
2023 Poster: In or Out? Fixing ImageNet Out-of-Distribution Detection Evaluation »
Julian Bitterwolf · Maximilian Müller · Matthias Hein -
2023 Poster: A Modern Look at the Relationship between Sharpness and Generalization »
Maksym Andriushchenko · Francesco Croce · Maximilian Müller · Matthias Hein · Nicolas Flammarion -
2023 Poster: Improving l1-Certified Robustness via Randomized Smoothing by Leveraging Box Constraints »
Václav Voráček · Matthias Hein -
2022 : On the interplay of adversarial robustness and architecture components: patches, convolution and attention »
Francesco Croce · Matthias Hein -
2022 Workshop: Shift happens: Crowdsourcing metrics and test datasets beyond ImageNet »
Roland S. Zimmermann · Julian Bitterwolf · Evgenia Rusak · Steffen Schneider · Matthias Bethge · Wieland Brendel · Matthias Hein -
2022 Poster: Breaking Down Out-of-Distribution Detection: Many Methods Based on OOD Training Data Estimate a Combination of the Same Core Quantities »
Julian Bitterwolf · Alexander Meinke · Maximilian Augustin · Matthias Hein -
2022 Spotlight: Breaking Down Out-of-Distribution Detection: Many Methods Based on OOD Training Data Estimate a Combination of the Same Core Quantities »
Julian Bitterwolf · Alexander Meinke · Maximilian Augustin · Matthias Hein -
2022 Poster: Adversarial Robustness against Multiple and Single $l_p$-Threat Models via Quick Fine-Tuning of Robust Classifiers »
Francesco Croce · Matthias Hein -
2022 Poster: Provably Adversarially Robust Nearest Prototype Classifiers »
Václav Voráček · Matthias Hein -
2022 Poster: Evaluating the Adversarial Robustness of Adaptive Test-time Defenses »
Francesco Croce · Sven Gowal · Thomas Brunner · Evan Shelhamer · Matthias Hein · Taylan Cemgil -
2022 Spotlight: Adversarial Robustness against Multiple and Single $l_p$-Threat Models via Quick Fine-Tuning of Robust Classifiers »
Francesco Croce · Matthias Hein -
2022 Spotlight: Evaluating the Adversarial Robustness of Adaptive Test-time Defenses »
Francesco Croce · Sven Gowal · Thomas Brunner · Evan Shelhamer · Matthias Hein · Taylan Cemgil -
2022 Spotlight: Provably Adversarially Robust Nearest Prototype Classifiers »
Václav Voráček · Matthias Hein -
2021 : Discussion Panel #1 »
Hang Su · Matthias Hein · Liwei Wang · Sven Gowal · Jan Hendrik Metzen · Henry Liu · Yisen Wang -
2021 : Invited Talk #3 »
Matthias Hein -
2021 Poster: Mind the Box: $l_1$-APGD for Sparse Adversarial Attacks on Image Classifiers »
Francesco Croce · Matthias Hein -
2021 Spotlight: Mind the Box: $l_1$-APGD for Sparse Adversarial Attacks on Image Classifiers »
Francesco Croce · Matthias Hein -
2020 : Contributed Talk 1: Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks »
David Stutz -
2020 : Keynote #1 Matthias Hein »
Matthias Hein -
2020 Poster: Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack »
Francesco Croce · Matthias Hein -
2020 Poster: Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks »
Francesco Croce · Matthias Hein -
2020 Poster: Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks »
Agustinus Kristiadi · Matthias Hein · Philipp Hennig -
2019 : Spotlight »
Tyler Scott · Kiran Thekumparampil · Jonathan Aigrain · Rene Bidart · Priyadarshini Panda · Dian Ang Yap · Yaniv Yacoby · Raphael Gontijo Lopes · Alberto Marchisio · Erik Englesson · Wanqian Yang · Moritz Graule · Yi Sun · Daniel Kang · Mike Dusenberry · Min Du · Hartmut Maennel · Kunal Menda · Vineet Edupuganti · Luke Metz · David Stutz · Vignesh Srinivasan · Timo Sämann · Vineeth N Balasubramanian · Sina Mohseni · Rob Cornish · Judith Butepage · Zhangyang Wang · Bai Li · Bo Han · Honglin Li · Maksym Andriushchenko · Lukas Ruff · Meet P. Vadera · Yaniv Ovadia · Sunil Thulasidasan · Disi Ji · Gang Niu · Saeed Mahloujifar · Aviral Kumar · SANGHYUK CHUN · Dong Yin · Joyce Xu Xu · Hugo Gomes · Raanan Rohekar -
2019 Poster: Spectral Clustering of Signed Graphs via Matrix Power Means »
Pedro Mercado · Francesco Tudisco · Matthias Hein -
2019 Oral: Spectral Clustering of Signed Graphs via Matrix Power Means »
Pedro Mercado · Francesco Tudisco · Matthias Hein