Abstract:
In class-imbalanced learning, the scarcity of information about minority classes makes it difficult to learn generalizable features for those classes. Leveraging large-scale pre-trained models with strong generalization capabilities as teacher models can help fill this information gap. Traditional knowledge distillation transfers the label distribution $p(\boldsymbol{y}|\boldsymbol{x})$ predicted by the teacher model to the student model. However, this approach falls short on imbalanced data because it fails to capture the teacher model's class-conditional probability distribution $p(\boldsymbol{x}|\boldsymbol{y})$, which is crucial for enhancing generalization. To overcome this, we propose Class-Conditional Knowledge Distillation (CCKD), a novel approach that enables the student to learn the teacher model's class-conditional probability distribution during distillation. Additionally, we introduce Augmented CCKD (ACCKD), which performs distillation on a constructed class-balanced dataset (formed through data mixing) and feature imitation on the entire dataset to further facilitate feature learning. Experimental results on various imbalanced datasets show that our method improves average accuracy by 7.4%.
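For reference, the traditional distillation objective contrasted above can be sketched as follows. This is a minimal PyTorch-style illustration of the standard baseline, which transfers the teacher's softened label distribution $p(\boldsymbol{y}|\boldsymbol{x})$ to the student; it is not the proposed CCKD loss, and the temperature `T` and weighting `alpha` are illustrative assumptions.

```python
# Minimal sketch of *traditional* knowledge distillation (the baseline discussed
# in the abstract), NOT the proposed CCKD/ACCKD objective. Temperature T and the
# weighting alpha are assumed illustrative hyperparameters.
import torch
import torch.nn.functional as F

def traditional_kd_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.5):
    """Cross-entropy on hard labels plus KL divergence to the teacher's
    softened label distribution p(y|x)."""
    ce = F.cross_entropy(student_logits, targets)
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # scale by T^2 to keep gradient magnitudes comparable
    return alpha * ce + (1.0 - alpha) * kl
```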