ICML Characterizing the Optimal $0-1$ Loss for Multi-class Classification with a Test-time Attacker

Oral in Workshop: 2nd ICML Workshop on New Frontiers in Adversarial Machine Learning


Keywords: [ Adversarial Robustness ] [ Theory ]


Abstract: Finding classifiers robust to adversarial examples is critical for their safe deployment. Determining the robustness of the best possible classifier under a given threat model for a fixed data distribution and comparing it to that achieved by state-of-the-art training methods is thus an important diagnostic tool. In this paper, we find achievable information-theoretic lower bounds on robust loss in the presence of a test-time attacker for *multi-class classifiers on any discrete dataset*. We provide a general framework for finding the optimal $0-1$ loss that revolves around the construction of a conflict hypergraph from the data and adversarial constraints. The prohibitive cost of this formulation in practice leads us to formulate other variants of the attacker-classifier game that more efficiently determine the range of the optimal loss. Our evaluation shows, for the first time, an analysis of the gap to optimal robustness for classifiers in the multi-class setting on benchmark datasets.
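
As a rough illustration of what building conflicts "from the data and adversarial constraints" can look like, the sketch below collects only the pairwise conflicts (size-2 hyperedges) of a small labeled dataset. It is not the authors' implementation. It assumes an L2 attacker with budget `eps`, so two differently labeled points conflict when their `eps`-balls overlap, that is, when they lie within `2 * eps` of each other. The function name `pairwise_conflicts` and the toy data are hypothetical.

```python
from itertools import combinations
import math


def pairwise_conflicts(points, labels, eps):
    """Return index pairs (i, j), i < j, that an attacker could confuse.

    Assumption (not from the abstract): the attacker may move each point
    by at most eps in L2 norm, so two eps-balls share a point exactly when
    the centers are at distance <= 2 * eps.
    """
    edges = []
    for i, j in combinations(range(len(points)), 2):
        if labels[i] == labels[j]:
            # Same-class points never conflict: one label serves both.
            continue
        if math.dist(points[i], points[j]) <= 2 * eps:
            edges.append((i, j))
    return edges


# Toy dataset: three points, three classes (hypothetical values).
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
labels = [0, 1, 2]
edges = pairwise_conflicts(points, labels, eps=0.5)
print(edges)
```

Higher-order hyperedges (three or more points of distinct classes whose perturbation sets share a common point) are omitted here; checking them requires a common-intersection test over larger subsets, which is where the cost of the full construction grows.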
