SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation

Wuxinlin Cheng · Chenhui Deng · Zhiqiang Zhao · Yaohui Cai · Zhiru Zhang · Zhuo Feng

Keywords: [ Adversarial Examples ] [ Algorithms ] [ Visualization or Exposition Techniques for Deep Networks ] [ Computer Vision ] [ Image Segmentation ]

Poster (Spot D1): Tue 20 Jul 9 a.m. PDT — 11 a.m. PDT
Spotlight presentation (Deep Learning Applications): Tue 20 Jul 5 a.m. PDT — 6 a.m. PDT


A black-box spectral method is introduced for evaluating the adversarial robustness of a given machine learning (ML) model. Our approach, named SPADE, exploits a bijective distance mapping between input and output graphs constructed to approximate the manifolds underlying the input and output data. Leveraging the generalized Courant-Fischer theorem, we propose a SPADE score for evaluating the adversarial robustness of a given model, which is proved to be an upper bound of the best Lipschitz constant under the manifold setting. To reveal the most non-robust data samples, those highly vulnerable to adversarial attacks, we develop a spectral graph embedding procedure based on the dominant generalized eigenvectors. This embedding step assigns each data point a robustness score that can be further harnessed for more effective adversarial training of ML models. Our experiments show promising empirical results for neural networks trained on the MNIST and CIFAR-10 datasets.
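The pipeline the abstract describes can be illustrated concretely: build graphs over the input data and the model's output data, form their Laplacians, and solve a generalized eigenvalue problem whose largest eigenvalue serves as the graph-based robustness score, with the dominant eigenvector components ranking individual samples. The following is a minimal NumPy/SciPy sketch under simplifying assumptions (unnormalized Laplacians of symmetrized k-NN graphs, a small regularizer for numerical stability); the function names and the toy "model" are illustrative, not the authors' implementation.

```python
import numpy as np
from scipy.linalg import eigh  # generalized symmetric eigensolver


def knn_laplacian(X, k=5):
    """Unnormalized Laplacian of a symmetrized k-nearest-neighbor graph."""
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    nbrs = np.argsort(dist, axis=1)[:, 1:k + 1]  # skip self (column 0)
    A = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    A[rows, nbrs.ravel()] = 1.0
    A = np.maximum(A, A.T)  # symmetrize the adjacency matrix
    return np.diag(A.sum(axis=1)) - A


def spade_score(X_in, X_out, k=5, reg=1e-6):
    """Largest generalized eigenvalue of (L_out, L_in): a proxy for the
    manifold-based Lipschitz upper bound sketched in the abstract."""
    n = len(X_in)
    L_in = knn_laplacian(X_in, k) + reg * np.eye(n)   # regularize: make PD
    L_out = knn_laplacian(X_out, k) + reg * np.eye(n)
    vals, vecs = eigh(L_out, L_in)  # solves L_out v = lambda * L_in v
    return vals[-1], vecs[:, ::-1]  # top eigenvalue, eigenvectors descending


# Toy usage: random inputs and a stand-in for a model's output embeddings.
rng = np.random.default_rng(0)
X_in = rng.normal(size=(60, 10))
X_out = np.tanh(X_in @ rng.normal(size=(10, 4)))  # hypothetical model outputs
score, V = spade_score(X_in, X_out)
# Per-sample robustness ranking from the dominant generalized eigenvector
# (larger components flag samples where the output graph distorts the most).
node_scores = np.abs(V[:, 0])
```

A larger `score` suggests the model stretches distances more aggressively somewhere on the data manifold, and sorting `node_scores` in descending order surfaces the candidate non-robust samples that the abstract proposes prioritizing during adversarial training.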
