Class-Conditional Distribution Balancing for Group Robust Classification
Abstract
Spurious correlations that lead models to correct predictions for the wrong reasons pose a critical challenge for robust real-world generalization. Existing research attributes this issue to group imbalance and addresses it by maximizing group-balanced or worst-group accuracy, which relies heavily on expensive bias annotations. A compromise approach predicts bias information with large-scale pretrained foundation models, but this requires extensive data and is limited to physically interpretable biases. To address these challenges, we offer a novel perspective by reframing spurious correlations as imbalances or mismatches in class-conditional distributions induced by general biases, interpretable or not, and propose a simple yet effective robust learning method that eliminates the need for bias annotations or predictions. With the goal of maximizing the conditional entropy (uncertainty) of the label given spurious factors, our method uses a sample reweighting strategy to achieve class-conditional distribution balancing, which automatically upweights minority groups and classes, effectively dismantling spurious correlations and producing a debiased data distribution for classification. Extensive experiments and analysis demonstrate that our approach consistently delivers state-of-the-art performance, rivaling methods that rely on bias supervision.
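To make the reweighting idea concrete, the sketch below illustrates one classic way such class-conditional balancing can be realized: given labels and a (hypothetical, discrete) bias assignment, each sample is weighted by the importance ratio p(y)p(b)/p(y, b), so that under the reweighted distribution the label is independent of the bias factor and the conditional entropy H(y|b) reaches its maximum H(y). This is an illustrative toy under the assumption of observed discrete bias factors, not the paper's actual algorithm, which requires no bias annotations.

```python
import numpy as np

def balancing_weights(y, b):
    """Per-sample weights making labels y independent of bias factor b
    under the reweighted distribution (illustrative sketch only)."""
    y = np.asarray(y)
    b = np.asarray(b)
    classes, y_idx = np.unique(y, return_inverse=True)
    biases, b_idx = np.unique(b, return_inverse=True)

    # Empirical joint distribution p(y, b) from group counts.
    joint = np.zeros((len(classes), len(biases)))
    np.add.at(joint, (y_idx, b_idx), 1.0)
    joint /= len(y)

    p_y = joint.sum(axis=1, keepdims=True)   # marginal p(y)
    p_b = joint.sum(axis=0, keepdims=True)   # marginal p(b)

    # Importance ratio p(y)p(b) / p(y,b): rare (minority) groups get
    # large weights, majority groups get small ones.
    w = (p_y * p_b) / np.maximum(joint, 1e-12)
    weights = w[y_idx, b_idx]
    return weights / weights.mean()          # normalize to mean 1
```

Samples from minority groups (rare label-bias combinations) receive the largest weights, which is exactly the "automatically highlights minority groups" behavior described above; when every group is non-empty, the reweighted joint factorizes as p(y)p(b).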