InstaHide: Instance-hiding Schemes for Private Distributed Learning
Yangsibo Huang · Zhao Song · Kai Li · Sanjeev Arora

Wed Jul 15 05:00 AM -- 05:45 AM & Wed Jul 15 04:00 PM -- 04:45 PM (PDT)

How can multiple distributed entities train a shared deep net on their private data while protecting data privacy? This paper introduces InstaHide, a simple encryption of training images. Encrypted images can be used in standard deep learning pipelines (PyTorch, Federated Learning, etc.) with no additional setup or infrastructure. The encryption has only a minor effect on test accuracy (unlike differential privacy).

Encryption consists of mixing the image with a set of other images (in the sense of the Mixup data augmentation technique (Zhang et al., 2018)), followed by applying a random pixel-wise mask to the mixed image. Other contributions of this paper are: (a) Use of a large public dataset of images (e.g. ImageNet) for mixing during encryption; this improves security. (b) Experiments demonstrating effectiveness in protecting privacy against known attacks while preserving model accuracy. (c) Theoretical analysis showing that successfully attacking privacy requires attackers to solve a difficult computational problem. (d) Demonstration, via some efficient attacks, that Mixup alone is insecure (contrary to recent proposals). (e) Release of a challenge dataset to allow design of new attacks.
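
The two encryption steps described above (Mixup-style mixing, then a random pixel-wise mask) can be illustrated with a minimal NumPy sketch. This is not the authors' released code. The function name `instahide_encrypt`, the Dirichlet sampling of mixing weights, and the choice of a random ±1 sign flip as the mask are assumptions made for illustration; the abstract only states that a random pixel-wise mask is applied. Label mixing and any constraints on the weights are omitted.

```python
import numpy as np

def instahide_encrypt(private_img, public_imgs, rng):
    """Hypothetical sketch: mix one private image with public images,
    then apply a random pixel-wise mask to the mixture."""
    # Stack the private image with the public images: shape (k, H, W, C).
    imgs = np.stack([private_img, *public_imgs]).astype(np.float64)
    k = imgs.shape[0]
    # Assumption: nonnegative mixing weights summing to 1 (a convex
    # combination, as in Mixup), sampled here from a flat Dirichlet.
    lam = rng.dirichlet(np.ones(k))
    mixed = np.tensordot(lam, imgs, axes=1)  # weighted sum over the k images
    # Assumption: the mask is an independent random sign (+1 or -1) per pixel.
    mask = rng.choice([-1.0, 1.0], size=mixed.shape)
    return mask * mixed, lam

rng = np.random.default_rng(0)
private = rng.uniform(-1.0, 1.0, size=(4, 4, 3))      # stand-in private image
public = [rng.uniform(-1.0, 1.0, size=(4, 4, 3)) for _ in range(3)]
encrypted, weights = instahide_encrypt(private, public, rng)
```

In this sketch a fresh mask and fresh weights are drawn on every call, so encrypting the same private image twice gives different outputs. The encrypted array has the same shape as the input image, which is consistent with the claim that it can be fed into a standard training pipeline unchanged.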

Author Information

Yangsibo Huang (Princeton University)
Zhao Song (IAS/Princeton)
Kai Li (Princeton University)
Sanjeev Arora (Princeton University and Institute for Advanced Study)

More from the Same Authors