ICML Poster Adversarial Parameter Attack on Deep Neural Networks

Lijia Yu · Yihan Wang · Xiao-Shan Gao

Exhibit Hall 1 #527

Abstract: The parameter perturbation attack is a safety threat to deep learning, in which small perturbations are made to the network parameters so that the attacked network gives wrong labels, or labels chosen by the adversary, for specified inputs. However, such attacks can be detected by the user, because the accuracy of the attacked network drops and the network no longer works normally. To make the attack stealthier, this paper proposes the adversarial parameter attack, in which small perturbations are made to the parameters of the network such that the accuracy of the attacked network does not decrease much, but its robustness against adversarial example attacks becomes much lower. As a consequence, the attacked network performs normally on standard samples, but is much more vulnerable to adversarial attacks. The existence of nearly perfect adversarial parameters under the $L_\infty$ norm and the $L_0$ norm is proved under reasonable conditions. Algorithms are given that produce high-quality adversarial parameters for commonly used networks trained with various robust training methods, in the sense that the robustness of the attacked networks decreases significantly when they are evaluated using various adversarial attack methods.
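
The idea in the abstract can be illustrated on a toy linear classifier. The sketch below is not the algorithm from the paper; it is a minimal example under stated assumptions: the data, the helper names (`clean_accuracy`, `robust_accuracy`, `toy_parameter_attack`, `demo`) and the heuristic of pushing every weight away from zero by `eps` are all hypothetical choices made for illustration. It relies on the standard fact that a linear classifier sign(w·x + b) resists every $L_\infty$ input perturbation of size delta at (x, y) exactly when y(w·x + b) > delta·||w||_1.

```python
import numpy as np


def clean_accuracy(w, b, X, y):
    """Fraction of samples that sign(w.x + b) labels correctly."""
    return float(np.mean(np.sign(X @ w + b) == y))


def robust_accuracy(w, b, X, y, delta):
    """Fraction of samples that stay correct under every L_inf input
    perturbation of size delta (exact for a linear classifier)."""
    margins = y * (X @ w + b)
    return float(np.mean(margins > delta * np.sum(np.abs(w))))


def toy_parameter_attack(w, eps):
    """Hypothetical heuristic, not the paper's method: move every weight
    away from zero by eps, so ||w' - w||_inf = eps while ||w'||_1 grows."""
    direction = np.where(w == 0, 1.0, np.sign(w))
    return w + eps * direction


def demo(n=200, d=50, eps=0.05, delta=0.5, seed=0):
    rng = np.random.default_rng(seed)
    y = rng.choice([-1.0, 1.0], size=n)
    # Only feature 0 carries the label; the others are small noise.
    X = 0.01 * rng.standard_normal((n, d))
    X[:, 0] = y

    w = np.zeros(d)
    w[0] = 1.0
    b = 0.0
    w_attacked = toy_parameter_attack(w, eps)

    return {
        "param_change_linf": float(np.max(np.abs(w_attacked - w))),
        "clean_before": clean_accuracy(w, b, X, y),
        "clean_after": clean_accuracy(w_attacked, b, X, y),
        "robust_before": robust_accuracy(w, b, X, y, delta),
        "robust_after": robust_accuracy(w_attacked, b, X, y, delta),
    }


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value:.3f}")
```

In this toy setup the clean accuracy is unchanged by the perturbation, while the robust accuracy at `delta = 0.5` falls from 1.0 to 0.0, because the many nearly unused weights inflate ||w||_1 without affecting predictions on clean inputs. Deep networks are not linear, so this only conveys the intuition of the attack, not its actual construction or guarantees.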
