Ensembles of deep neural networks have demonstrated superior performance, but their heavy computational cost hinders their deployment in resource-limited environments. This motivates distilling the knowledge of the ensemble teacher into a smaller student network, for which there are two important design choices: 1) how to construct the student network, and 2) what data should be shown during training. In this paper, we propose a weight averaging technique in which a student with multiple subnetworks is trained to absorb the functional diversity of the ensemble teachers, and those subnetworks are then properly averaged for inference, yielding a single student network with no additional inference cost. We also propose a perturbation strategy that seeks out inputs from which the diversity of the teachers can be better transferred to the student. Combining these two, our method significantly improves upon previous methods on various image classification tasks.
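As a rough illustration of the two ideas above, here is a minimal PyTorch sketch: one function averages the parameters of several identically-shaped student subnetworks into a single inference network, and another perturbs inputs toward higher disagreement among teacher predictions. The function names, the use of prediction variance as the disagreement objective, and all hyperparameters are hypothetical illustrations, not the paper's actual implementation.

import copy
import torch

def average_subnetworks(subnetworks):
    # Average the parameters of same-architecture subnetworks into one
    # network for inference. Note: buffers (e.g., BatchNorm running
    # statistics) are simply inherited from the first subnetwork here.
    averaged = copy.deepcopy(subnetworks[0])
    with torch.no_grad():
        for name, param in averaged.named_parameters():
            stacked = torch.stack(
                [dict(m.named_parameters())[name] for m in subnetworks]
            )
            param.copy_(stacked.mean(dim=0))  # element-wise mean
    return averaged

def diversify_input(x, teachers, step_size=0.01, steps=1):
    # Nudge input x in the direction that increases the variance of the
    # teachers' predictive distributions (one plausible notion of a
    # "diversifying" perturbation; the paper's exact objective may differ).
    x = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        probs = torch.stack([t(x).softmax(dim=-1) for t in teachers])
        disagreement = probs.var(dim=0).sum()
        (grad,) = torch.autograd.grad(disagreement, x)
        x = (x + step_size * grad.sign()).detach().requires_grad_(True)
    return x.detach()

In this sketch, the student's subnetworks would be trained on such perturbed inputs to match the teachers, and average_subnetworks would be called once before deployment, so that inference costs the same as a single network.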
Author Information
Giung Nam (KAIST)
Hyungi Lee (KAIST)
Byeongho Heo (NAVER AI LAB)
Juho Lee (KAIST, AITRICS)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Spotlight: Improving Ensemble Distillation With Weight Averaging and Diversifying Perturbation »
  Tue. Jul 19th 05:50 -- 05:55 PM, Room: Ballroom 1 & 2
More from the Same Authors
- 2023: Function Space Bayesian Pseudocoreset for Bayesian Neural Networks »
  Balhae Kim · Hyungi Lee · Juho Lee
- 2023: Early Exiting for Accelerated Inference in Diffusion Models »
  Taehong Moon · Moonseok Choi · EungGu Yun · Jongmin Yoon · Gayoung Lee · Juho Lee
- 2023: Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models »
  Sanghyun Kim · Seohyeon Jung · Balhae Kim · Moonseok Choi · Jinwoo Shin · Juho Lee
- 2023 Poster: Probabilistic Imputation for Time-series Classification with Missing Data »
  SeungHyun Kim · Hyunsu Kim · EungGu Yun · Hwangrae Lee · Jaehun Lee · Juho Lee
- 2023 Poster: Regularizing Towards Soft Equivariance Under Mixed Symmetries »
  Hyunsu Kim · Hyungi Lee · Hongseok Yang · Juho Lee
- 2023 Poster: Scalable Set Encoding with Universal Mini-Batch Consistency and Unbiased Full Set Gradient Approximation »
  Jeffrey Willette · Seanie Lee · Bruno Andreis · Kenji Kawaguchi · Juho Lee · Sung Ju Hwang
- 2023 Poster: Traversing Between Modes in Function Space for Fast Ensembling »
  EungGu Yun · Hyungi Lee · Giung Nam · Juho Lee
- 2022 Poster: Set Based Stochastic Subsampling »
  Bruno Andreis · Seanie Lee · A. Tuan Nguyen · Juho Lee · Eunho Yang · Sung Ju Hwang
- 2022 Spotlight: Set Based Stochastic Subsampling »
  Bruno Andreis · Seanie Lee · A. Tuan Nguyen · Juho Lee · Eunho Yang · Sung Ju Hwang
- 2021 Poster: Adversarial Purification with Score-based Generative Models »
  Jongmin Yoon · Sung Ju Hwang · Juho Lee
- 2021 Spotlight: Adversarial Purification with Score-based Generative Models »
  Jongmin Yoon · Sung Ju Hwang · Juho Lee