Provably Adversarially Robust Detection of Out-of-Distribution Data (Almost) for Free
Alexander Meinke · Julian Bitterwolf · Matthias Hein

The application of machine learning in safety-critical systems requires a reliable assessment of uncertainty. However, deep neural networks are known to produce highly overconfident predictions on out-of-distribution (OOD) data. Even if a network is trained to have low confidence on OOD data, one can still adversarially manipulate OOD samples so that the classifier again assigns high confidence to them. We show that two previously published defenses can be broken by better adapted attacks, highlighting the importance of robustness guarantees around OOD data. Since the existing method with such guarantees is hard to train and significantly limits accuracy, we construct a classifier that simultaneously achieves provable OOD-robustness guarantees and high clean accuracy. Moreover, by architectural construction our method provably avoids the asymptotic overconfidence problem of standard neural networks.
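The abstract describes combining a provably robust OOD detector with an accurate standard classifier. The toy sketch below illustrates that general idea only; it is not the authors' construction. It assumes a hypothetical certified in-distribution score `p_in` (e.g., a lower bound that holds over a perturbation ball) and caps the classifier's softmax confidence by it, so that wherever `p_in` is provably small, the reported confidence is provably small too, while the predicted class (the argmax) is untouched.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def capped_confidence(logits, p_in):
    """Illustrative sketch (not the paper's method): report the classifier's
    top softmax probability, capped by p_in, an assumed *certified* probability
    that the input is in-distribution. If p_in <= eps holds provably on a
    perturbation ball around an OOD point, the reported confidence is
    provably <= eps there as well, while clean accuracy is unaffected
    because the argmax of the logits is unchanged."""
    conf = softmax(np.asarray(logits, dtype=float)).max()
    return min(conf, p_in)

# An in-distribution input keeps its high confidence...
print(capped_confidence([5.0, 0.0, 0.0], p_in=1.0))
# ...while a certified-OOD input has its confidence capped.
print(capped_confidence([5.0, 0.0, 0.0], p_in=0.1))
```

The design choice worth noting is that the cap only modifies the reported confidence, not the decision rule, which is why such a combination can add guarantees "almost for free" in terms of clean accuracy.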

Author Information

Alexander Meinke (University of Tübingen)
Julian Bitterwolf (University of Tübingen)
Matthias Hein (University of Tübingen)