Poster
Adversarial Perturbations Are Formed by Iteratively Learning Linear Combinations of the Right Singular Vectors of the Adversarial Jacobian
Thomas Paniagua · Chinmay Savadikar · Tianfu Wu
East Exhibition Hall A-B #E-2206
Deep neural networks (DNNs) are highly accurate but remain vulnerable to adversarial attacks: small, often imperceptible changes to input images that cause incorrect outputs. While most attacks focus on altering the top-1 prediction, many real-world systems (e.g., search engines, medical triage) rely on the entire ranked list of outputs. This raises a key question: how can we trick a DNN into producing an ordered set of incorrect predictions?

We address this with RisingAttacK, a novel method that directly learns adversarial perturbations in image space. Using Sequential Quadratic Programming, it optimizes minimal, interpretable changes that manipulate the model's top-K ranking. The attack leverages linear combinations of the most sensitive directions, derived from the adversarial Jacobian, to efficiently disrupt the model's output ordering.

RisingAttacK consistently outperforms prior state-of-the-art attacks across four major models and ranking depths (K = 1 to 30), achieving higher success rates and lower perturbation norms. By enabling precise manipulation of ranked outputs, our method delivers the kind of comprehensive stress tests increasingly demanded by regulators and practitioners, tests that top-1-only attacks simply cannot provide.
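As a rough illustration of the idea named in the title, the sketch below (not the authors' implementation) forms a perturbation as a linear combination of the leading right singular vectors of the Jacobian of the targeted logits with respect to the input, refined over a few iterations. The function name ranking_attack_sketch and hyperparameters such as top_r, num_iters, and step_size are illustrative assumptions, and the combination coefficients here come from a simple gradient projection rather than the Sequential Quadratic Programming solver described in the abstract.

    # Minimal sketch, assuming a PyTorch classifier `model` and an input image
    # tensor x of shape (1, C, H, W). Not the authors' code: it only illustrates
    # building a perturbation in the span of the top right singular vectors of
    # the "adversarial Jacobian" (targeted logits w.r.t. the input pixels).
    import torch

    def ranking_attack_sketch(model, x, target_classes,
                              top_r=8, num_iters=20, step_size=0.5):
        """Perturb x so the logits of `target_classes` rise in the ranking."""
        x = x.clone().detach()
        delta = torch.zeros_like(x)

        for _ in range(num_iters):
            x_adv = (x + delta).detach().requires_grad_(True)
            logits = model(x_adv)                        # (1, num_classes)

            # Jacobian of the targeted logits w.r.t. the input pixels.
            J = torch.autograd.functional.jacobian(
                lambda inp: model(inp)[0, target_classes], x_adv
            )
            J = J.reshape(len(target_classes), -1)       # (K, D)

            # Right singular vectors span the input directions the targeted
            # ranking is most sensitive to; keep only the leading ones.
            _, _, Vh = torch.linalg.svd(J, full_matrices=False)
            V = Vh[:top_r]                               # (top_r, D)

            # Coefficients for the linear combination: project the gradient of
            # the summed targeted logits onto V (an assumption standing in for
            # the paper's SQP step), then map back to image space.
            margin = logits[0, target_classes].sum()
            g = torch.autograd.grad(margin, x_adv)[0].reshape(-1)
            coeffs = V @ g                               # (top_r,)
            step = (coeffs @ V).reshape_as(x)            # perturbation direction

            delta = (delta + step_size * step / (step.norm() + 1e-12)).detach()

        return (x + delta).detach()

The gradient projection above is only a placeholder for the coefficient search; the abstract's point is that restricting the update to the span of those singular vectors keeps the perturbation small while still reordering the top-K predictions.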