Knowledge distillation deals with the problem of training a smaller model (\emph{Student}) from a high-capacity source model (\emph{Teacher}) so as to retain most of its performance. Existing approaches use either the training data or meta-data extracted from it in order to train the \emph{Student}. However, accessing the dataset on which the \emph{Teacher} has been trained may not always be feasible if the dataset is very large or if it poses privacy or safety concerns (e.g., biometric or medical data). Hence, in this paper, we propose a novel data-free method to train the \emph{Student} from the \emph{Teacher}. Without using any meta-data, we synthesize \emph{Data Impressions} from the complex \emph{Teacher} model and utilize these as surrogates for the original training data samples to transfer its learning to the \emph{Student} via knowledge distillation. We therefore dub our method ``Zero-Shot Knowledge Distillation" and demonstrate that, on multiple benchmark datasets, our framework achieves generalization performance competitive with distillation using the actual training data samples.
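In outline, the approach has two stages: (i) synthesize Data Impressions by optimizing noise inputs so that the frozen Teacher's softened outputs match sampled target softmax vectors, and (ii) run standard temperature-scaled distillation on the synthesized samples to train the Student. The PyTorch sketch below illustrates both stages; the function names, the hyperparameters, and the way target vectors are supplied are illustrative assumptions, not the authors' exact implementation (in the paper, the target softmax vectors are sampled from class-wise Dirichlet distributions built from the Teacher's final-layer class similarities).

# Minimal sketch of the zero-shot distillation idea, assuming a frozen Teacher
# and randomly initialized Student (both torch.nn.Module classifiers).
import torch
import torch.nn.functional as F

def synthesize_data_impressions(teacher, target_probs, input_shape,
                                steps=1500, lr=0.01, temperature=20.0):
    """Optimize random noise so that the frozen Teacher's softened outputs
    match the sampled target class-probability vectors (one per sample)."""
    teacher.eval()
    x = torch.randn(target_probs.size(0), *input_shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        log_q = F.log_softmax(teacher(x) / temperature, dim=1)
        loss = F.kl_div(log_q, target_probs, reduction="batchmean")
        loss.backward()
        opt.step()
    return x.detach()

def distill_step(student, teacher, x, optimizer, temperature=20.0):
    """One knowledge-distillation update of the Student on a batch of
    Data Impressions (no ground-truth labels are available)."""
    teacher.eval()
    with torch.no_grad():
        t_probs = F.softmax(teacher(x) / temperature, dim=1)
    s_log_probs = F.log_softmax(student(x) / temperature, dim=1)
    loss = F.kl_div(s_log_probs, t_probs, reduction="batchmean") * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

In a training loop, one would repeatedly sample target probability vectors, synthesize a batch of Data Impressions with the first routine, and then call the second routine over several epochs on the accumulated synthetic set; only the distillation loss is used, since no ground-truth labels exist in the data-free setting.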
Author Information
Gaurav Kumar Nayak (Indian Institute of Science)
Konda Reddy Mopuri (University of Edinburgh)
Vaisakh Shaj (University Of Lincoln)
I am currently pursuing my PhD in Robot Learning under Prof. Gerhard Neumann at the CLAS Lab, University of Lincoln, UK. Before that I was a Data Scientist at the cybersecurity firm McAfee (Intel Security), and previously I worked at Intel for 2 years. I hold a postgraduate degree in Machine Learning and Computing from the Indian Institute of Space Science and Technology.
Venkatesh Babu Radhakrishnan (Indian Institute of Science)
Anirban Chakraborty (Indian Institute of Science)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Oral: Zero-Shot Knowledge Distillation in Deep Networks »
  Thu. Jun 13th, 06:30 -- 06:35 PM, Room: Grand Ballroom
More from the Same Authors
- 2021 : Towards Achieving Adversarial Robustness Beyond Perceptual Limits »
  Sravanti Addepalli · Samyak Jain · Gaurang Sriramanan · Shivangi Khare · Venkatesh Babu Radhakrishnan
- 2022 : Efficient and Effective Augmentation Strategy for Adversarial Training »
  Sravanti Addepalli · Samyak Jain · Venkatesh Babu Radhakrishnan
- 2023 : SelMix: Selective Mixup Fine Tuning for Optimizing Non-Decomposable Metrics »
  Shrinivas Ramasubramanian · Harsh Rangwani · Sho Takemori · Kunal Samanta · Yuhei Umeda · Venkatesh Babu Radhakrishnan
- 2022 Poster: A Closer Look at Smoothness in Domain Adversarial Training »
  Harsh Rangwani · Sumukh K Aithal · Mayank Mishra · Arihant Jain · Venkatesh Babu Radhakrishnan
- 2022 Poster: Balancing Discriminability and Transferability for Source-Free Domain Adaptation »
  Jogendra Nath Kundu · Akshay Kulkarni · Suvaansh Bhambri · Deepesh Mehta · Shreyas Kulkarni · Varun Jampani · Venkatesh Babu Radhakrishnan
- 2022 Spotlight: Balancing Discriminability and Transferability for Source-Free Domain Adaptation »
  Jogendra Nath Kundu · Akshay Kulkarni · Suvaansh Bhambri · Deepesh Mehta · Shreyas Kulkarni · Varun Jampani · Venkatesh Babu Radhakrishnan
- 2022 Spotlight: A Closer Look at Smoothness in Domain Adversarial Training »
  Harsh Rangwani · Sumukh K Aithal · Mayank Mishra · Arihant Jain · Venkatesh Babu Radhakrishnan