We attempt to interpret how adversarially trained convolutional neural networks (AT-CNNs) recognize objects. We design systematic approaches to interpret AT-CNNs in both qualitative and quantitative ways and compare them with normally trained models. Surprisingly, we find that on object recognition tasks, adversarial training alleviates the texture bias of standard CNNs and helps them learn a more shape-biased representation.
We validate our hypothesis from two aspects. First, we compare the salience maps of AT-CNNs and standard CNNs on clean images and on images under different transformations; the comparison visually shows that the predictions of the two types of CNNs are sensitive to dramatically different kinds of features. Second, for quantitative verification, we construct additional test datasets that destroy either textures or shapes, such as style-transferred versions of the clean data, saturated images, and patch-shuffled ones, and then evaluate the classification accuracy of AT-CNNs and normally trained CNNs on these datasets.
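For concreteness, below is a minimal sketch (not the released code from the Supplementary Materials) of how two such texture/shape-destroying transformations could be implemented with NumPy. The function names, the `saturation_level` parameter, and the exact saturation curve are illustrative assumptions rather than the paper's precise definitions.

import numpy as np

def saturate(image, saturation_level=8.0):
    # Push pixel values toward 0 or 1, suppressing fine texture while
    # preserving the object's overall shape. `image` is a float array in [0, 1].
    # The exponent schedule below is an assumed form: level 2 is the identity,
    # larger levels saturate more strongly.
    x = 2.0 * image - 1.0                                # map to [-1, 1]
    x = np.sign(x) * np.abs(x) ** (2.0 / saturation_level)
    return (x + 1.0) / 2.0                               # map back to [0, 1]

def patch_shuffle(image, k=4, rng=None):
    # Split the image into a k x k grid of patches and randomly permute them,
    # destroying global shape while keeping local texture statistics.
    rng = rng if rng is not None else np.random.default_rng()
    h, w = image.shape[:2]
    ph, pw = h // k, w // k
    patches = [image[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
               for i in range(k) for j in range(k)]
    order = rng.permutation(len(patches))
    rows = [np.concatenate([patches[order[i * k + j]] for j in range(k)], axis=1)
            for i in range(k)]
    return np.concatenate(rows, axis=0)

Evaluating AT-CNNs and standard CNNs on images processed this way (together with a style-transferred test set produced by an off-the-shelf style-transfer model) then yields the quantitative shape-versus-texture comparison described above.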
Our findings shed light on why AT-CNNs are more robust than normally trained ones and contribute to a better understanding of adversarial training of CNNs from an interpretation perspective.
The code for reproducibility is provided in the Supplementary Materials.
Author Information
Tianyuan Zhang (Peking University)
Zhanxing Zhu (Peking University)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Poster: Interpreting Adversarially Trained Convolutional Neural Networks
  Thu. Jun 13th 01:30 -- 04:00 AM, Room Pacific Ballroom #148
More from the Same Authors
- 2023 Poster: MonoFlow: Rethinking Divergence GANs via the Perspective of Differential Equations
  Mingxuan Yi · Zhanxing Zhu · Song Liu
- 2021 Workshop: ICML Workshop on Theoretic Foundation, Criticism, and Application Trend of Explainable AI
  Quanshi Zhang · Tian Han · Lixin Fan · Zhanxing Zhu · Hang Su · Ying Nian Wu
- 2021 Poster: Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
  Zeke Xie · Li Yuan · Zhanxing Zhu · Masashi Sugiyama
- 2021 Spotlight: Positive-Negative Momentum: Manipulating Stochastic Gradient Noise to Improve Generalization
  Zeke Xie · Li Yuan · Zhanxing Zhu · Masashi Sugiyama
- 2020 Poster: On Breaking Deep Generative Model-based Defenses and Beyond
  Yanzhi Chen · Renjie Xie · Zhanxing Zhu
- 2020 Poster: Informative Dropout for Robust Representation Learning: A Shape-bias Perspective
  Baifeng Shi · Dinghuai Zhang · Qi Dai · Zhanxing Zhu · Yadong Mu · Jingdong Wang
- 2020 Poster: On the Noisy Gradient Descent that Generalizes as SGD
  Jingfeng Wu · Wenqing Hu · Haoyi Xiong · Jun Huan · Vladimir Braverman · Zhanxing Zhu
- 2019 Poster: The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects
  Zhanxing Zhu · Jingfeng Wu · Bing Yu · Lei Wu · Jinwen Ma
- 2019 Oral: The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects
  Zhanxing Zhu · Jingfeng Wu · Bing Yu · Lei Wu · Jinwen Ma