
Benchmarking Differential Privacy and Federated Learning for BERT Models
Priyam Basu · Rakshit Naidu · Zumrut Muftuoglu · Sahib Singh · FatemehSadat Mireshghallah

Depression is a serious medical illness that can adversely affect how one feels, thinks, and acts, leading to emotional and physical problems. Natural Language Processing (NLP) techniques can be applied to help diagnose such illnesses from people's written utterances. Due to the sensitive nature of such data, privacy measures need to be taken when handling it and training models on it. In this work, we study the effects that Differential Privacy (DP) and Federated Learning (FL) have on training contextualized language models (BERT, ALBERT, RoBERTa and DistilBERT), and offer insights on how to privately train NLP models. We envisage this work being used in the healthcare/mental health industry to keep medical history private, and we therefore provide an open-source implementation. To examine how these privacy techniques behave across datasets, we also evaluate the approach on a Sexual Harassment Twitter dataset.
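The two privacy techniques the abstract names can be illustrated on a toy model: DP-SGD clips each per-example gradient and adds Gaussian noise before the update, while FedAvg aggregates client models as a data-weighted average. The sketch below (function names `dp_sgd_step` and `fed_avg`, and all hyperparameters, are illustrative assumptions, not the paper's implementation; the actual BERT experiments would rely on a DP library such as Opacus and a FL framework) shows both mechanisms on logistic regression:

```python
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, clip_norm=1.0, noise_mult=1.1, rng=None):
    """One DP-SGD step on logistic regression: clip each per-example
    gradient to clip_norm, sum, add Gaussian noise, then update."""
    rng = rng if rng is not None else np.random.default_rng(0)
    clipped = []
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + np.exp(-xi @ w))          # sigmoid prediction
        g = (p - yi) * xi                          # per-example gradient
        g = g / max(1.0, np.linalg.norm(g) / clip_norm)  # L2 clipping
        clipped.append(g)
    g_sum = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_mult * clip_norm, size=w.shape)
    return w - lr * (g_sum + noise) / len(X)

def fed_avg(client_weights, client_sizes):
    """FedAvg aggregation: average client parameters weighted by
    each client's local dataset size."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))
```

In a federated setup, each client would run local (optionally DP) steps like `dp_sgd_step` on its own data, and the server would combine the resulting parameters with `fed_avg` each round.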

Author Information

Priyam Basu (Manipal Institute of Technology)
Rakshit Naidu (Carnegie Mellon University)
Zumrut Muftuoglu (Yildiz Technical University)
Sahib Singh (OpenMined; Ford R&A)
FatemehSadat Mireshghallah (University of California San Diego)
