Generalizing Adversarial Training to Composite Semantic Perturbations
Yun-Yun Tsai · Lei Hsiung · Pin-Yu Chen · Tsung-Yi Ho
Model robustness against adversarial examples has been widely studied, yet generalizing it to more realistic scenarios remains challenging. Specifically, recent works using adversarial training can successfully improve model robustness, but they primarily consider adversarial threat models limited to $\ell_{p}$-norm bounded perturbations and may overlook semantic perturbations and their composition. In this paper, we first propose a novel method for generating composite adversarial examples. By utilizing component-wise PGD updates and automatic attack-order scheduling, our method can find the optimal attack composition. We then propose generalized adversarial training (GAT) to extend model robustness from $\ell_{p}$-norm to composite semantic perturbations, such as Hue, Saturation, Brightness, Contrast, and Rotation. The results show that GAT is robust not only to any single attack but also to combinations of multiple attacks. GAT also outperforms baseline adversarial training approaches by a significant margin.
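
To make the idea of a composite semantic attack concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy grayscale image stored as a list of floats, two hypothetical components (`brightness` and `contrast`) with illustrative bounds, and a stand-in `toy_loss` in place of a classifier loss. Each component parameter gets its own signed gradient ascent step and is projected back into its own interval, which is the component-wise PGD idea. Gradients are estimated by finite differences, and the attack order is chosen by exhaustive search over permutations, a simple substitute for the automatic scheduling described in the abstract.

```python
from itertools import permutations


def clip(v, lo, hi):
    return max(lo, min(hi, v))


def brightness(img, b):
    # Additive shift, pixels kept in [0, 1].
    return [clip(p + b, 0.0, 1.0) for p in img]


def contrast(img, c):
    # Multiplicative scale, pixels kept in [0, 1].
    return [clip(p * c, 0.0, 1.0) for p in img]


# Hypothetical components: name -> (transform, lower bound, upper bound, identity value).
COMPONENTS = {
    "brightness": (brightness, -0.2, 0.2, 0.0),
    "contrast": (contrast, 0.7, 1.3, 1.0),
}


def apply_composite(img, params, order):
    # Apply each semantic perturbation in the given order.
    out = img
    for name in order:
        out = COMPONENTS[name][0](out, params[name])
    return out


def toy_loss(img):
    # Stand-in for a model loss (assumption): distance of mean intensity from 0.5.
    mean = sum(img) / len(img)
    return (mean - 0.5) ** 2


def component_wise_pgd(img, order, steps=10, fd_eps=1e-4):
    # Start every component at its identity value.
    params = {name: COMPONENTS[name][3] for name in order}
    for _ in range(steps):
        for name in order:
            _, lo, hi, _ = COMPONENTS[name]
            step = (hi - lo) / 8
            up = dict(params, **{name: params[name] + fd_eps})
            down = dict(params, **{name: params[name] - fd_eps})
            # Central finite-difference estimate of d(loss)/d(param).
            grad = (toy_loss(apply_composite(img, up, order))
                    - toy_loss(apply_composite(img, down, order))) / (2 * fd_eps)
            sign = (grad > 0) - (grad < 0)
            # Signed ascent step, then projection onto this component's interval.
            params[name] = clip(params[name] + step * sign, lo, hi)
    return params, toy_loss(apply_composite(img, params, order))


def best_composite_attack(img):
    # Exhaustive search over attack orders (stand-in for learned scheduling).
    best = None
    for order in permutations(COMPONENTS):
        params, loss = component_wise_pgd(img, order)
        if best is None or loss > best[2]:
            best = (order, params, loss)
    return best


image = [0.2, 0.4, 0.6, 0.9]
order, params, loss = best_composite_attack(image)
print(order, params, loss)
```

Because each transform clips pixel values, the components do not commute, so different orderings can reach different losses for the same parameter bounds. Exhaustive search is only practical for a handful of components, since the number of orderings grows factorially.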

Author Information

Yun-Yun Tsai (Columbia University)
Lei Hsiung (National Tsing Hua University)
Pin-Yu Chen (IBM Research AI)
Tsung-Yi Ho (National Tsing Hua University)