The teacher-student (TS) framework, which trains a (student) network with the help of an auxiliary superior (teacher) network, has been a popular training paradigm in many machine learning schemes since the seminal work on knowledge distillation (KD) for model compression and transfer learning. Many recent self-supervised learning (SSL) schemes also adopt the TS framework, where the teacher networks, called momentum networks, are maintained as the moving average of the student networks. This paper presents TSPipe, a pipelined approach to accelerate the training process of any TS framework, including KD and SSL. Based on the observation that the teacher network does not need a backward pass, our main idea is to schedule the computation of the teacher and student networks separately, and to fully utilize the GPU during training by interleaving the computations of the two networks and relaxing their dependencies. When the teacher network requires a momentum update, we use delayed parameter updates only on the teacher network to attain high model accuracy. Compared to existing pipeline parallelism schemes, which sacrifice either training throughput or model accuracy, TSPipe provides better performance trade-offs, achieving up to 12.15x higher throughput.
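As an illustration of the momentum networks mentioned in the abstract, the following minimal sketch shows a teacher maintained as an exponential moving average of a student. It is not TSPipe's implementation: the function name `momentum_update`, the coefficient name `tau`, and the use of plain Python lists in place of real network parameters are all assumptions made for this example. Note that the teacher update only reads the student's parameters and needs no gradients, which is the property the abstract relies on when it says the teacher requires no backward pass.

```python
from typing import List


def momentum_update(teacher: List[float], student: List[float], tau: float = 0.99) -> List[float]:
    """Return new teacher parameters as a moving average of the student.

    Hypothetical helper for illustration only: each teacher parameter moves
    toward the matching student parameter, teacher <- tau * teacher + (1 - tau) * student.
    """
    if len(teacher) != len(student):
        raise ValueError("teacher and student must have the same number of parameters")
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return [tau * t + (1.0 - tau) * s for t, s in zip(teacher, student)]


# Toy parameters standing in for network weights (assumed values).
teacher = [0.0, 0.0]
student = [1.0, 2.0]

# Apply three momentum updates while the student stays fixed;
# the teacher drifts toward the student without any gradient computation.
for _ in range(3):
    teacher = momentum_update(teacher, student, tau=0.5)
```

A larger `tau` makes the teacher change more slowly; the value 0.5 is used here only so the toy numbers are easy to follow.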
Author Information
Hwijoon Lim (KAIST)
Yechan Kim (KAIST)
Sukmin Yun (KAIST)
Jinwoo Shin (KAIST)
Dongsu Han (KAIST)
Related Events (a corresponding poster, oral, or spotlight)
-
2022 Poster: TSPipe: Learn from Teacher Faster with Pipelines »
Wed. Jul 20th through Thu. Jul 21st, Room: Hall E #722
More from the Same Authors
-
2021 : SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Adversarial Robustness »
Jongheon Jeong · Sejun Park · Minkyu Kim · Heung-Chang Lee · Doguk Kim · Jinwoo Shin -
2021 : Entropy Weighted Adversarial Training »
Minseon Kim · Jihoon Tack · Jinwoo Shin · Sung Ju Hwang -
2021 : Consistency Regularization for Adversarial Robustness »
Jihoon Tack · Sihyun Yu · Jongheon Jeong · Minseon Kim · Sung Ju Hwang · Jinwoo Shin -
2023 : Few-shot Anomaly Detection via Personalization »
Sangkyung Kwak · Jongheon Jeong · Hankook Lee · Woohyuck Kim · Jinwoo Shin -
2023 : Bias-to-Text: Debiasing Unknown Visual Biases by Language Interpretation »
Younghyun Kim · Sangwoo Mo · Minkyu Kim · Kyungmin Lee · Jaeho Lee · Jinwoo Shin -
2023 : Breaking the Spurious Causality of Conditional Generation via Fairness Intervention with Corrective Sampling »
Jun Hyun Nam · Sangwoo Mo · Jaeho Lee · Jinwoo Shin -
2023 : Guide Your Agent with Adaptive Multimodal Rewards »
Changyeon Kim · Younggyo Seo · Hao Liu · Lisa Lee · Jinwoo Shin · Honglak Lee · Kimin Lee -
2023 : Collaborative Score Distillation for Consistent Visual Synthesis »
Subin Kim · Kyungmin Lee · June Suk Choi · Jongheon Jeong · Kihyuk Sohn · Jinwoo Shin -
2023 : Semi-supervised Tabular Classification via In-context Learning of Large Language Models »
Jaehyun Nam · Woomin Song · Seong Hyeon Park · Jihoon Tack · Sukmin Yun · Jaehyung Kim · Jinwoo Shin -
2023 : Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models »
Sanghyun Kim · Seohyeon Jung · Balhae Kim · Moonseok Choi · Jinwoo Shin · Juho Lee -
2023 Poster: Prefer to Classify: Improving Text Classifiers via Auxiliary Preference Learning »
Jaehyung Kim · Jinwoo Shin · Dongyeop Kang -
2023 Poster: Modality-Agnostic Variational Compression of Implicit Neural Representations »
Jonathan Richard Schwarz · Jihoon Tack · Yee-Whye Teh · Jaeho Lee · Jinwoo Shin -
2023 Poster: Multi-View Masked World Models for Visual Robotic Manipulation »
Younggyo Seo · Junsu Kim · Stephen James · Kimin Lee · Jinwoo Shin · Pieter Abbeel -
2022 Poster: Disentangling Sources of Risk for Distributional Multi-Agent Reinforcement Learning »
Kyunghwan Son · Junsu Kim · Sungsoo Ahn · Roben Delos Reyes · Yung Yi · Jinwoo Shin -
2022 Poster: Time Is MattEr: Temporal Self-supervision for Video Transformers »
Sukmin Yun · Jaehyung Kim · Dongyoon Han · Hwanjun Song · Jung-Woo Ha · Jinwoo Shin -
2022 Spotlight: Disentangling Sources of Risk for Distributional Multi-Agent Reinforcement Learning »
Kyunghwan Son · Junsu Kim · Sungsoo Ahn · Roben Delos Reyes · Yung Yi · Jinwoo Shin -
2022 Spotlight: Time Is MattEr: Temporal Self-supervision for Video Transformers »
Sukmin Yun · Jaehyung Kim · Dongyoon Han · Hwanjun Song · Jung-Woo Ha · Jinwoo Shin -
2021 : Contrastive Learning for Novelty Detection »
Jinwoo Shin -
2021 Poster: Self-Improved Retrosynthetic Planning »
Junsu Kim · Sungsoo Ahn · Hankook Lee · Jinwoo Shin -
2021 Spotlight: Self-Improved Retrosynthetic Planning »
Junsu Kim · Sungsoo Ahn · Hankook Lee · Jinwoo Shin -
2021 Poster: Learning to Generate Noise for Multi-Attack Robustness »
Divyam Madaan · Jinwoo Shin · Sung Ju Hwang -
2021 Spotlight: Learning to Generate Noise for Multi-Attack Robustness »
Divyam Madaan · Jinwoo Shin · Sung Ju Hwang -
2021 Poster: State Entropy Maximization with Random Encoders for Efficient Exploration »
Younggyo Seo · Lili Chen · Jinwoo Shin · Honglak Lee · Pieter Abbeel · Kimin Lee -
2021 Spotlight: State Entropy Maximization with Random Encoders for Efficient Exploration »
Younggyo Seo · Lili Chen · Jinwoo Shin · Honglak Lee · Pieter Abbeel · Kimin Lee -
2020 Poster: Self-supervised Label Augmentation via Input Transformations »
Hankook Lee · Sung Ju Hwang · Jinwoo Shin -
2020 Poster: Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning »
Kimin Lee · Younggyo Seo · Seunghyun Lee · Honglak Lee · Jinwoo Shin -
2020 Poster: Polynomial Tensor Sketch for Element-wise Function of Low-Rank Matrix »
Insu Han · Haim Avron · Jinwoo Shin -
2020 Poster: Learning What to Defer for Maximum Independent Sets »
Sungsoo Ahn · Younggyo Seo · Jinwoo Shin -
2020 Poster: Adversarial Neural Pruning with Latent Vulnerability Suppression »
Divyam Madaan · Jinwoo Shin · Sung Ju Hwang -
2019 Poster: Spectral Approximate Inference »
Sejun Park · Eunho Yang · Se-Young Yun · Jinwoo Shin -
2019 Poster: Robust Inference via Generative Classifiers for Handling Noisy Labels »
Kimin Lee · Sukmin Yun · Kibok Lee · Honglak Lee · Bo Li · Jinwoo Shin -
2019 Poster: Learning What and Where to Transfer »
Yunhun Jang · Hankook Lee · Sung Ju Hwang · Jinwoo Shin -
2019 Oral: Spectral Approximate Inference »
Sejun Park · Eunho Yang · Se-Young Yun · Jinwoo Shin -
2019 Oral: Robust Inference via Generative Classifiers for Handling Noisy Labels »
Kimin Lee · Sukmin Yun · Kibok Lee · Honglak Lee · Bo Li · Jinwoo Shin -
2019 Oral: Learning What and Where to Transfer »
Yunhun Jang · Hankook Lee · Sung Ju Hwang · Jinwoo Shin -
2019 Poster: Training CNNs with Selective Allocation of Channels »
Jongheon Jeong · Jinwoo Shin -
2019 Oral: Training CNNs with Selective Allocation of Channels »
Jongheon Jeong · Jinwoo Shin -
2018 Poster: Bucket Renormalization for Approximate Inference »
Sungsoo Ahn · Michael Chertkov · Adrian Weller · Jinwoo Shin -
2018 Oral: Bucket Renormalization for Approximate Inference »
Sungsoo Ahn · Michael Chertkov · Adrian Weller · Jinwoo Shin -
2017 Poster: Faster Greedy MAP Inference for Determinantal Point Processes »
Insu Han · Prabhanjan Kambadur · Kyoungsoo Park · Jinwoo Shin -
2017 Poster: Confident Multiple Choice Learning »
Kimin Lee · Changho Hwang · KyoungSoo Park · Jinwoo Shin -
2017 Talk: Confident Multiple Choice Learning »
Kimin Lee · Changho Hwang · KyoungSoo Park · Jinwoo Shin -
2017 Talk: Faster Greedy MAP Inference for Determinantal Point Processes »
Insu Han · Prabhanjan Kambadur · Kyoungsoo Park · Jinwoo Shin