On Frank-Wolfe Adversarial Training
Theodoros Tsiligkaridis · Jay Roberts
We develop a theoretical framework for adversarial training (AT) with Frank-Wolfe (FW) optimization (FW-AT) that reveals a geometric connection between the loss landscape and the distortion of $\ell_\infty$ FW attacks (the attack's $\ell_2$ norm). Specifically, we show that high distortion of FW attacks is equivalent to low variation along the attack path. Experiments on various deep neural network architectures then demonstrate that $\ell_\infty$ attacks against robust models achieve near-maximal $\ell_2$ distortion. To demonstrate the utility of our theoretical framework, we develop FW-Adapt, a novel adversarial training algorithm that uses a simple distortion measure to adapt the number of attack steps during training. FW-Adapt provides strong robustness against white-box and black-box attacks with shorter training times than PGD-AT.
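The distortion notion in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy losses, the step-size schedule `2 / (k + 2)`, and the helper names `fw_linf_attack` and `distortion_ratio` are assumptions chosen for illustration. It runs a Frank-Wolfe attack inside an $\ell_\infty$ ball and reports the perturbation's $\ell_2$ norm relative to its maximum possible value $\epsilon\sqrt{d}$.

```python
import math

def sign(v):
    # Returns -1, 0, or 1 depending on the sign of v.
    return (v > 0) - (v < 0)

def fw_linf_attack(grad_fn, x, eps, steps):
    """Frank-Wolfe ascent over the l_inf ball of radius eps around x.

    grad_fn maps a point to the loss gradient at that point.
    Returns the perturbation delta (hypothetical helper, for illustration).
    """
    delta = [0.0] * len(x)
    for k in range(steps):
        g = grad_fn([xi + di for xi, di in zip(x, delta)])
        # Linear maximization oracle for the l_inf ball: the corner eps * sign(g).
        corner = [eps * sign(gi) for gi in g]
        gamma = 2.0 / (k + 2.0)  # assumed step-size schedule (gamma = 1 at k = 0)
        delta = [(1 - gamma) * di + gamma * ci for di, ci in zip(delta, corner)]
    return delta

def distortion_ratio(delta, eps):
    """l2 norm of delta divided by eps * sqrt(d), the largest l2 norm in the ball."""
    l2 = math.sqrt(sum(di * di for di in delta))
    return l2 / (eps * math.sqrt(len(delta)))

x0 = [0.0, 0.0, 0.0, 0.0]
eps = 0.1

# Toy loss 1: linear, so the gradient is constant along the attack path.
w = [1.0, -2.0, 0.5, -0.3]
flat = distortion_ratio(fw_linf_attack(lambda z: w, x0, eps, 4), eps)

# Toy loss 2: strongly curved, so the gradient sign flips along the path.
curved_grad = lambda z: [1.0 - 20.0 * (zi - xi) for zi, xi in zip(z, x0)]
curved = distortion_ratio(fw_linf_attack(curved_grad, x0, eps, 4), eps)

print(flat, curved)
```

In this toy setting the constant-gradient loss drives every step to the same corner of the ball, so the distortion ratio stays close to 1, while the loss whose gradient changes sign along the path averages opposing corners and ends with a noticeably smaller ratio.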
Author Information
Theodoros Tsiligkaridis (MIT Lincoln Laboratory, Massachusetts Institute of Technology)
Jay Roberts (Massachusetts Institute of Technology)
More from the Same Authors
- 2023: ERM++: An Improved Baseline for Domain Generalization
  Piotr Teterwak · Kuniaki Saito · Theodoros Tsiligkaridis · Kate Saenko · Bryan Plummer
- 2023 Poster: Domain Adaptation for Time Series Under Feature and Label Shifts
  Huan He · Owen Queen · Teddy Koker · Consuelo Cuevas · Theodoros Tsiligkaridis · Marinka Zitnik
- 2023 Poster: Supervised Metric Learning to Rank for Retrieval via Contextual Similarity Optimization
  Christopher Liao · Theodoros Tsiligkaridis · Brian Kulis