SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation

Wuxinlin Cheng · Chenhui Deng · Zhiqiang Zhao · Yaohui Cai · Zhiru Zhang · Zhuo Feng

[ Abstract ] [ Livestream: Visit Deep Learning Applications ] [ Paper ]
[ Paper ]

A black-box spectral method is introduced for evaluating the adversarial robustness of a given machine learning (ML) model. Our approach, named SPADE, exploits bijective distance mapping between the input/output graphs constructed for approximating the manifolds corresponding to the input/output data. By leveraging the generalized Courant-Fischer theorem, we propose a SPADE score for evaluating the adversarial robustness of a given model, which is proved to be an upper bound of the best Lipschitz constant under the manifold setting. To reveal the most non-robust data samples highly vulnerable to adversarial attacks, we develop a spectral graph embedding procedure leveraging dominant generalized eigenvectors. This embedding step allows assigning each data point a robustness score that can be further harnessed for more effective adversarial training of ML models. Our experiments show promising empirical results for neural networks trained with the MNIST and CIFAR-10 data sets.

Chat is not available.