Towards Out-of-Distribution Adversarial Robustness
Adam Ibrahim · Charles Guille-Escuret · Ioannis Mitliagkas · Irina Rish · David Krueger · Pouya Bashivan
Adversarial robustness continues to be a major challenge for deep learning. A core issue is that robustness to one type of attack often fails to transfer to other attacks. While prior work establishes a theoretical trade-off in robustness against different $L_p$ norms, we show that there is space for improvement against many commonly used attacks by adopting a domain generalisation approach. In particular, we treat different attacks as domains, and apply the method of Risk Extrapolation (REx), which encourages similar levels of robustness against all training attacks. Compared to existing methods, we obtain similar or superior adversarial robustness on attacks seen during training. More significantly, we achieve superior performance on families or tunings of attacks only encountered at test time. On ensembles of attacks, this improves the accuracy from 3.4% on the best existing baseline to 25.9% on MNIST, and from 10.7% to 17.9% on CIFAR10.
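To illustrate the idea of treating attacks as domains, the sketch below shows a V-REx-style training objective: per-attack risks are computed and their variance is penalized so that robustness is pushed toward similar levels across all training attacks. This is a minimal, hedged sketch in PyTorch-style Python; the attack callables and the penalty weight `beta` are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def vrex_adversarial_loss(model, x, y, attacks, beta=10.0):
    """REx-style objective over adversarial 'domains'.

    Each entry of `attacks` is assumed to be a callable
    (model, x, y) -> x_adv, e.g. PGD under different L_p norms.
    """
    risks = []
    for attack in attacks:
        x_adv = attack(model, x, y)            # craft adversarial examples for this attack
        logits = model(x_adv)
        risks.append(F.cross_entropy(logits, y))  # per-attack (per-domain) risk
    risks = torch.stack(risks)
    # Mean risk maintains overall robustness; the variance penalty encourages
    # the risks to be similar across all training attacks (the REx term).
    return risks.mean() + beta * risks.var()
```

With `beta = 0` this reduces to averaging adversarial losses over the training attacks; larger `beta` trades some average robustness for more uniform robustness across attacks.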

Author Information

Adam Ibrahim (Mila, Université de Montréal)
Charles Guille-Escuret (Mila, Université de Montréal)
Ioannis Mitliagkas (University of Montreal)
Irina Rish (Mila, Université de Montréal)
David Krueger (University of Cambridge)
Pouya Bashivan (McGill University)