Talk in Workshop: New Frontiers in Adversarial Machine Learning
Adversarial attacks on deep learning: Model explanation & transfer to the physical world
Ajmal Mian
Despite their remarkable success, deep models are brittle and can easily be manipulated by corrupting data with carefully crafted perturbations that are largely imperceptible to human observers. In this talk, I will give a brief background on three classes of attacks on deep models: adversarial perturbations, data poisoning, and Trojan models. I will then discuss universal perturbations, including our work on the detection and removal of such perturbations. Next, I will present the Label Universal Targeted Attack (LUTA), which is image-agnostic but optimized for a particular input class and target output class. LUTA has interesting properties beyond model fooling and can be extended to explain deep models and to perform image generation and manipulation. Universal perturbations, being image-agnostic, fingerprint the deep model itself; we show that they can be used to detect Trojaned models. In the last part of my talk, I will present our work on transferring adversarial attacks to the physical world, simulated using graphics. I will discuss attacks on action recognition, where the perturbations are computed on human skeletons and then transferred to videos. Finally, I will present our work on 3D adversarial textures, computed using neural rendering, that fool models in a pure black-box setting where both the target model and its training data are unknown. I will conclude my talk with some interesting insights into adversarial machine learning.
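For attendees unfamiliar with the attacks the talk builds on, a minimal sketch of the textbook fast gradient sign method (FGSM) illustrates how a small, nearly imperceptible perturbation can flip a classifier's prediction. This is a generic baseline, not the speaker's method; the function name and the eps budget are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, eps=8 / 255):
    """One-step FGSM sketch: nudge input x so that model misclassifies it.

    model: a classifier returning logits
    x:     input batch with pixel values in [0, 1]
    y:     true labels
    eps:   L-infinity budget, kept small so the change is imperceptible
    """
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction that increases the loss, then clamp to the
    # valid pixel range so the result is still a legal image.
    x_adv = x + eps * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

A universal perturbation, by contrast, is a single such delta optimized to fool the model on most inputs at once, which is why, as the abstract notes, it acts as a fingerprint of the model rather than of any one image.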