ICML Poster Provably Adversarially Robust Nearest Prototype Classifiers

Václav Voráček · Matthias Hein

Hall E #227

Keywords: [ SA: Trustworthy Machine Learning ] [ DL: Robustness ]

Abstract: Nearest prototype classifiers (NPCs) assign to each input point the label of the nearest prototype with respect to a chosen distance metric. A direct advantage of NPCs is that the decisions are interpretable. Previous work could provide lower bounds on the minimal adversarial perturbation in the $\ell_p$-threat model when using the same $\ell_p$-distance for the NPCs. In this paper we provide a complete discussion of the complexity when using $\ell_p$-distances for decision and $\ell_q$-threat models for certification for $p,q \in \{1,2,\infty\}$. In particular, we provide scalable algorithms for the exact computation of the minimal adversarial perturbation when using the $\ell_2$-distance, and improved lower bounds in the other cases. Using efficient improved lower bounds, we train our Provably adversarially robust NPC (PNPC) on MNIST, which has better $\ell_2$-robustness guarantees than neural networks. Additionally, we show, to our knowledge, the first certification results w.r.t. the LPIPS perceptual metric, which has been argued to be a more realistic threat model for image classification than $\ell_p$-balls. On CIFAR10, our PNPC has higher certified robust accuracy than the empirical robust accuracy reported by Laidlaw et al. (2021). The code is available at https://github.com/vvoracek/Provably-Adversarially-Robust-Nearest-Prototype-Classifiers.
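
To make the setting concrete, the following is a minimal sketch of an NPC with $\ell_2$-distance, together with a simple lower bound on the minimal $\ell_2$ adversarial perturbation. It is not the exact algorithm of the paper and is not taken from the authors' repository. The bound uses a standard hyperplane argument: any adversarial point must be at least as close to some other-class prototype $w_j$ as to the nearest prototype $w_i$, so it lies in a halfspace whose distance from $x$ is $(\|x-w_j\|_2^2 - \|x-w_i\|_2^2) / (2\|w_i-w_j\|_2)$. The function names (`npc_predict`, `l2_certified_radius`) and the toy data are assumptions chosen for illustration.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length sequences of floats."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def npc_predict(x, prototypes, labels):
    """Return (label, index) of the prototype nearest to x in l2-distance."""
    idx = min(range(len(prototypes)), key=lambda i: l2_distance(x, prototypes[i]))
    return labels[idx], idx


def l2_certified_radius(x, prototypes, labels):
    """Return (predicted label, lower bound on the minimal l2 perturbation).

    Illustrative sketch only: for every prototype w_j of a different class,
    compute the distance from x to the halfspace of points that are at least
    as close to w_j as to the nearest prototype w_i, and take the minimum.
    """
    pred, i = npc_predict(x, prototypes, labels)
    d_i = l2_distance(x, prototypes[i])
    radius = math.inf  # stays infinite if every prototype shares the label
    for j, w_j in enumerate(prototypes):
        if labels[j] == pred:
            continue  # same-class prototypes cannot change the decision
        gap = l2_distance(prototypes[i], w_j)
        if gap == 0.0:
            return pred, 0.0  # coinciding prototypes with different labels
        d_j = l2_distance(x, w_j)
        radius = min(radius, (d_j ** 2 - d_i ** 2) / (2.0 * gap))
    return pred, radius


# Toy data (hypothetical): two prototypes in the plane, one per class.
prototypes = [[0.0, 0.0], [4.0, 0.0]]
labels = [0, 1]
pred, radius = l2_certified_radius([1.0, 0.0], prototypes, labels)
print(pred, radius)
```

In this toy case the decision boundary is the line where the first coordinate equals 2, so the point (1, 0) is assigned label 0 and the bound equals its distance to that boundary. With several prototypes per class the value is in general only a lower bound, which is the gap the exact $\ell_2$ computation described in the abstract addresses.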
