Timezone: »

How to Steer Your Adversary: Targeted and Efficient Model Stealing Defenses with Gradient Redirection
Mantas Mazeika · Bo Li · David Forsyth

Wed Jul 20 01:50 PM -- 01:55 PM (PDT) @ Room 307

Model stealing attacks present a dilemma for public machine learning APIs. To protect financial investments, companies may be forced to withhold important information about their models that could facilitate theft, including uncertainty estimates and prediction explanations. This compromise is harmful not only to users but also to external transparency. Model stealing defenses seek to resolve this dilemma by making models harder to steal while preserving utility for benign users. However, existing defenses have poor performance in practice, either requiring enormous computational overheads or severe utility trade-offs. To meet these challenges, we present a new approach to model stealing defenses called gradient redirection. At the core of our approach is a provably optimal, efficient algorithm for steering an adversary's training updates in a targeted manner. Combined with improvements to surrogate networks and a novel coordinated defense strategy, our gradient redirection defense, called GRAD^2, achieves small utility trade-offs and low computational overhead, outperforming the best prior defenses. Moreover, we demonstrate how gradient redirection enables reprogramming the adversary with arbitrary behavior, which we hope will foster work on new avenues of defense.

Author Information

Mantas Mazeika (UIUC)
Bo Li (UIUC)
Bo Li

Dr. Bo Li is an assistant professor in the Department of Computer Science at the University of Illinois at Urbana–Champaign. She is the recipient of the IJCAI Computers and Thought Award, Alfred P. Sloan Research Fellowship, AI’s 10 to Watch, NSF CAREER Award, MIT Technology Review TR-35 Award, Dean's Award for Excellence in Research, C.W. Gear Outstanding Junior Faculty Award, Intel Rising Star award, Symantec Research Labs Fellowship, Rising Star Award, Research Awards from Tech companies such as Amazon, Facebook, Intel, IBM, and eBay, and best paper awards at several top machine learning and security conferences. Her research focuses on both theoretical and practical aspects of trustworthy machine learning, which is at the intersection of machine learning, security, privacy, and game theory. She has designed several scalable frameworks for trustworthy machine learning and privacy-preserving data publishing. Her work has been featured by major publications and media outlets such as Nature, Wired, Fortune, and New York Times.

David Forsyth (Univeristy of Illinois at Urbana-Champaign)

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors