Poster
Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them
Florian Tramer
Making classifiers robust to adversarial examples is challenging. Thus, many works tackle the seemingly easier task of \emph{detecting} perturbed inputs. We show a barrier towards this goal. We prove a \emph{hardness reduction} between detection and classification of adversarial examples: given a robust detector for attacks at distance $\epsilon$ (in some metric), we show how to build a similarly robust (but inefficient) \emph{classifier} for attacks at distance $\epsilon/2$. Our reduction is \emph{computationally} inefficient, but preserves the \emph{data complexity} of the original detector. The reduction thus cannot be directly used to build practical classifiers. Instead, it is a useful sanity check to test whether empirical detection results imply something much stronger than the authors presumably anticipated (namely, a highly robust and data-efficient \emph{classifier}). To illustrate, we revisit $14$ empirical detector defenses published in recent years. For $12/14$ defenses, we show that the claimed detection results imply an inefficient classifier with robustness far beyond the state of the art, thus casting some doubt on the results' validity. Finally, we show that our reduction applies in both directions: a robust classifier for attacks at distance $\epsilon/2$ implies an inefficient robust detector at distance $\epsilon$. Thus, we argue that robust classification and robust detection should be regarded as (near)-equivalent problems, if we disregard their \emph{computational} complexity.
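To make the detector-to-classifier direction concrete, below is a minimal sketch of the kind of construction the abstract describes. It is not the paper's exact reduction: the names `classify`, `detect`, and `candidate_points` are illustrative assumptions, the $L_2$ metric is used for concreteness (the abstract allows any metric), and the exhaustive search over the $\epsilon/2$-ball, which is exactly what makes the constructed classifier computationally impractical, is replaced here by a finite candidate set.

```python
import numpy as np

def build_robust_classifier(classify, detect, eps, candidate_points):
    """Sketch: turn a detector robust at distance eps into a (slow)
    classifier robust at distance eps/2.

    classify(z)      -> label predicted by the base model on input z
    detect(z)        -> True if the defense flags z as adversarial
    candidate_points -> hypothetical finite stand-in for the eps/2-ball
                        around the input; enumerating the real ball is
                        what makes this construction inefficient.
    """
    def robust_classify(x_hat):
        for z in candidate_points:
            # Accept any nearby point that the detector does not flag
            # (L2 distance used here for concreteness).
            if np.linalg.norm(z - x_hat) <= eps / 2 and not detect(z):
                # If x_hat = x + delta with ||delta|| <= eps/2 for some
                # clean point x, then ||z - x|| <= eps by the triangle
                # inequality, so a detector robust at distance eps
                # guarantees that classify(z) returns x's true label.
                return classify(z)
        return None  # no accepted candidate found: abstain
    return robust_classify
```

The key point the sketch illustrates is that robustness transfers via the triangle inequality while the search over nearby accepted points blows up the computational cost, which is why the reduction is a sanity check rather than a recipe for practical classifiers.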
Author Information
Florian Tramer (Google)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Oral: Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them »
  Tue. Jul 19th, 02:30 -- 02:50 PM, Room Hall G
More from the Same Authors
- 2021 : Data Poisoning Won't Save You From Facial Recognition »
  Evani Radiya-Dixit · Florian Tramer
- 2021 : Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them »
  Florian Tramer
- 2023 : Backdoor Attacks for In-Context Learning with Language Models »
  Nikhil Kandpal · Matthew Jagielski · Florian Tramer · Nicholas Carlini
- 2023 : Evading Black-box Classifiers Without Breaking Eggs »
  Edoardo Debenedetti · Nicholas Carlini · Florian Tramer
- 2023 Poster: Preprocessors Matter! Realistic Decision-Based Attacks on Machine Learning Systems »
  Chawin Sitawarin · Florian Tramer · Nicholas Carlini
- 2021 : Contributed Talk #4 »
  Florian Tramer
- 2021 Poster: Label-Only Membership Inference Attacks »
  Christopher Choquette-Choo · Florian Tramer · Nicholas Carlini · Nicolas Papernot
- 2021 Spotlight: Label-Only Membership Inference Attacks »
  Christopher Choquette-Choo · Florian Tramer · Nicholas Carlini · Nicolas Papernot
- 2020 Poster: Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations »
  Florian Tramer · Jens Behrmann · Nicholas Carlini · Nicolas Papernot · Joern-Henrik Jacobsen
- 2019 Workshop: Workshop on the Security and Privacy of Machine Learning »
  Nicolas Papernot · Florian Tramer · Bo Li · Dan Boneh · David Evans · Somesh Jha · Percy Liang · Patrick McDaniel · Jacob Steinhardt · Dawn Song