Provably Adversarially Robust Detection of Out-of-Distribution Data (Almost) for Free
Alexander Meinke · Julian Bitterwolf · Matthias Hein

The application of machine learning in safety-critical systems requires a reliable assessment of uncertainty. However, deep neural networks are known to produce highly overconfident predictions on out-of-distribution (OOD) data. Even if a network is trained to have low confidence on OOD data, one can still adversarially manipulate OOD samples so that the classifier again assigns high confidence to them. We show that two previously published defenses can be broken by better adapted attacks, highlighting the importance of robustness guarantees around OOD data. Since the existing method with such guarantees is hard to train and significantly limits accuracy, we construct a classifier that simultaneously achieves provable OOD-robustness guarantees and high clean accuracy. Moreover, by architectural construction our method provably avoids the asymptotic overconfidence problem of standard neural networks.
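The abstract describes combining a provably robust OOD detector with an accurate standard classifier. The toy sketch below illustrates that general idea only; it is not the authors' construction. It assumes a hypothetical certified in-distribution score `p_in` (e.g., a lower bound that holds over a perturbation ball) and caps the classifier's softmax confidence by it, so that wherever `p_in` is provably small, the reported confidence is provably small too, while the predicted class (the argmax) is untouched.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def capped_confidence(logits, p_in):
    """Illustrative sketch (not the paper's method): report the classifier's
    top softmax probability, capped by p_in, an assumed *certified* probability
    that the input is in-distribution. If p_in <= eps holds provably on a
    perturbation ball around an OOD point, the reported confidence is
    provably <= eps there as well, while clean accuracy is unaffected
    because the argmax of the logits is unchanged."""
    conf = softmax(np.asarray(logits, dtype=float)).max()
    return min(conf, p_in)

# An in-distribution input keeps its high confidence...
print(capped_confidence([5.0, 0.0, 0.0], p_in=1.0))
# ...while a certified-OOD input has its confidence capped.
print(capped_confidence([5.0, 0.0, 0.0], p_in=0.1))
```

The design choice worth noting is that the cap only modifies the reported confidence, not the decision rule, which is why such a combination can add guarantees "almost for free" in terms of clean accuracy.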

Author Information

Alexander Meinke (University of Tübingen)
Julian Bitterwolf (University of Tübingen)
Matthias Hein (University of Tübingen)