

Oral

Certified Adversarial Robustness via Randomized Smoothing

Jeremy Cohen · Elan Rosenfeld · Zico Kolter

Abstract:

Recent work has used randomization to create classifiers that are provably robust to adversarial perturbations with small L2 norm. However, existing guarantees for such classifiers are unnecessarily loose. In this work, we provide the first tight analysis of these "randomized smoothing" classifiers. We then use the method to train an ImageNet classifier with, e.g., a provable top-1 accuracy of 59% under adversarial perturbations with L2 norm less than 57/255. No other provable adversarial defense has been shown to be feasible on ImageNet. On the smaller-scale datasets where alternative approaches are viable, randomized smoothing outperforms all alternatives by a large margin. While our specific method can certify robustness only in the L2 norm, the empirical success of the approach suggests that provable methods based on randomization are a promising direction for future research into adversarially robust classification.
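As a rough illustration of the technique the abstract describes, the sketch below builds a smoothed classifier g(x) = argmax_c P(f(x + eps) = c) with eps ~ N(0, sigma^2 I) by Monte Carlo sampling, and computes the paper's tight L2 certified radius R = (sigma / 2) * (Phi^-1(pA) - Phi^-1(pB)). This is not the paper's exact algorithm: the function names, the base-classifier callable `f`, and the sampling parameters are illustrative assumptions, and the full procedure in the paper additionally uses binomial confidence bounds on the class probabilities and may abstain.

```python
import numpy as np
from scipy.stats import norm


def smoothed_predict(f, x, sigma, n_samples=1000, num_classes=10):
    """Monte Carlo estimate of the smoothed classifier
    g(x) = argmax_c P(f(x + eps) = c), with eps ~ N(0, sigma^2 I).

    `f` is assumed to be a base classifier mapping an input array to an
    integer class index. Returns the top class and its estimated probability.
    """
    counts = np.zeros(num_classes, dtype=int)
    for _ in range(n_samples):
        noisy = x + sigma * np.random.randn(*x.shape)  # add isotropic Gaussian noise
        counts[f(noisy)] += 1
    top = int(np.argmax(counts))
    return top, counts[top] / n_samples


def certified_radius(p_a, p_b, sigma):
    """Tight L2 certified radius from the paper's main theorem:
    R = (sigma / 2) * (Phi^-1(p_A) - Phi^-1(p_B)),
    where p_A lower-bounds the top-class probability and p_B upper-bounds
    the runner-up probability. (The paper estimates these bounds with
    binomial confidence intervals, which this sketch omits.)
    """
    return (sigma / 2.0) * (norm.ppf(p_a) - norm.ppf(p_b))
```

For example, with sigma = 0.25, a top-class lower bound p_A = 0.9, and the crude runner-up bound p_B = 1 - p_A = 0.1, the sketch certifies robustness within an L2 radius of about 0.25 * Phi^-1(0.9) ≈ 0.32; larger noise levels sigma trade clean accuracy for larger certifiable radii.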
