Recent advances in the Vision Transformer (ViT) have demonstrated impressive performance in image classification, making it a promising alternative to the Convolutional Neural Network (CNN). Unlike CNNs, ViT represents an input image as a sequence of image patches. This patch-based input representation raises an interesting question: how does ViT perform, compared to CNNs, when individual input image patches are perturbed with natural corruptions or adversarial perturbations? In this submission, we propose to evaluate model robustness to patch-wise perturbations. Two types of patch perturbations are considered. One is natural corruption, which tests models' robustness under distributional shift. The other is adversarial perturbation, which is crafted by an adversary specifically to fool a model into making a wrong prediction. The experimental results on popular CNNs and ViTs are surprising: ViTs are more robust to naturally corrupted patches than CNNs, whereas they are more vulnerable to adversarial patches. Given the architectural traits of state-of-the-art ViTs and these findings, we propose to include robustness to natural patch corruption and adversarial patch attacks in the robustness benchmark.
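For concreteness, below is a minimal PyTorch sketch of the two patch-wise perturbation probes described in the abstract. It assumes images in [0, 1] with height and width divisible by a 16-pixel patch size, as in standard ViT models; the function names (`corrupt_patch`, `adversarial_patch`), the Gaussian-noise corruption, and the sign-gradient attack restricted to one patch are illustrative choices, not the exact protocol of the paper.

```python
import torch
import torch.nn.functional as F


def patch_slice(image, patch_size, patch_index):
    """Return (row, col) slices covering one ViT-style patch of a (C, H, W) image."""
    patches_per_row = image.shape[-1] // patch_size
    row = (patch_index // patches_per_row) * patch_size
    col = (patch_index % patches_per_row) * patch_size
    return slice(row, row + patch_size), slice(col, col + patch_size)


def corrupt_patch(image, patch_size=16, patch_index=0, noise_std=0.1):
    """Natural-corruption probe: add Gaussian noise to a single patch only."""
    rows, cols = patch_slice(image, patch_size, patch_index)
    corrupted = image.clone()
    corrupted[:, rows, cols] += noise_std * torch.randn_like(corrupted[:, rows, cols])
    return corrupted.clamp(0.0, 1.0)


def adversarial_patch(model, image, label, patch_size=16, patch_index=0,
                      steps=40, step_size=2 / 255):
    """Adversarial-patch probe: sign-gradient ascent on the classification loss,
    restricted by a mask to the pixels of one patch; all other pixels stay untouched."""
    rows, cols = patch_slice(image, patch_size, patch_index)
    mask = torch.zeros_like(image)
    mask[:, rows, cols] = 1.0

    adv = image.clone()
    for _ in range(steps):
        adv = adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(adv.unsqueeze(0)), label.view(1))
        grad, = torch.autograd.grad(loss, adv)
        # Update only inside the patch; clamp keeps pixel values valid.
        adv = (adv + step_size * grad.sign() * mask).clamp(0.0, 1.0)
    return adv.detach()
```

Under these assumptions, the evaluation amounts to measuring a model's accuracy on images where a chosen patch is perturbed in one of these two ways, which is the spirit of the ViT-vs-CNN comparison summarized above.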
Author Information
Jindong Gu (University of Munich)
Volker Tresp (Siemens AG and University of Munich)
Volker Tresp received a Diploma degree from the University of Goettingen, Germany, in 1984 and the M.Sc. and Ph.D. degrees from Yale University, New Haven, CT, in 1986 and 1989, respectively. Since 1989 he has headed various research teams in machine learning at Siemens, Research and Technology. He has filed more than 70 patent applications and was Siemens Inventor of the Year in 1996. He has published more than 150 scientific articles and supervised more than 20 Ph.D. theses. The company Panoratio is a spin-off of his team. His research focus in recent years has been "Machine Learning in Information Networks" for modelling knowledge graphs, medical decision processes, and sensor networks. He is the coordinator of one of the first nationally funded Big Data projects for the realization of "Precision Medicine". Since 2011 he has also been a Professor at the Ludwig Maximilian University of Munich, where he teaches an annual course on Machine Learning.
Yao Qin (Google)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 : Evaluating Model Robustness to Patch Perturbations »
More from the Same Authors
- 2021 : What Are Effective Labels for Augmented Data? Improving Calibration and Robustness with AutoLabel »
  Yao Qin · Jasper Snoek · Balaji Lakshminarayanan
- 2021 : What Are Effective Labels for Augmented Data? Improving Calibration and Robustness with AutoLabel »
  Yao Qin · Jasper Snoek
- 2019 Poster: Maximum Entropy-Regularized Multi-Goal Reinforcement Learning »
  Rui Zhao · Xudong Sun · Volker Tresp
- 2019 Oral: Maximum Entropy-Regularized Multi-Goal Reinforcement Learning »
  Rui Zhao · Xudong Sun · Volker Tresp
- 2017 Poster: Tensor-Train Recurrent Neural Networks for Video Classification »
  Yinchong Yang · Denis Krompass · Volker Tresp
- 2017 Talk: Tensor-Train Recurrent Neural Networks for Video Classification »
  Yinchong Yang · Denis Krompass · Volker Tresp