Poster in Workshop: Spurious correlations, Invariance, and Stability (SCIS)

SelecMix: Debiased Learning by Mixing up Contradicting Pairs

Inwoo Hwang · Sangjun Lee · Yunhyeok Kwak · Seong Joon Oh · Damien Teney · Jin-Hwa Kim · Byoung-Tak Zhang

Keywords: [ mixup ] [ debias ] [ spurious correlation ]


Abstract:

Neural networks trained with ERM (empirical risk minimization) sometimes learn unintended decision rules, in particular when their training data is biased, i.e., when training labels are correlated with undesirable features. Techniques have been proposed to prevent a network from learning such features, using the heuristic that spurious correlations are "too simple" and are learned preferentially by SGD during training. Recent methods resample or augment the training data such that examples displaying spurious correlations (a.k.a. bias-aligned examples) become a minority, whereas the other, bias-conflicting examples become prevalent. These approaches are difficult to train and to scale to real-world data, e.g., because they rely on disentangled representations. We propose an alternative based on mixup that augments the available bias-conflicting training data with convex combinations of existing examples and their labels. Our method, named SelecMix, applies mixup to selected pairs of examples, which show either (i) the same label but dissimilar biased features, or (ii) different labels but similar biased features. To compare examples along biased features, we use an auxiliary model relying on the heuristic that biased features are learned preferentially by SGD during training. On semi-synthetic benchmarks where this heuristic is valid, we obtain results superior to existing methods, in particular in the presence of label noise, which complicates the identification of bias-conflicting examples.
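To make the pair-selection idea concrete, below is a minimal PyTorch-style sketch, not the authors' implementation: it assumes an auxiliary model `biased_model` (trained so that it preferentially captures the biased features) exposes an embedding method `features()`, and it approximates similarity along biased features with cosine similarity of those embeddings. The combined selection rule, the name `selecmix_batch`, and the Beta-distributed mixing weight are illustrative assumptions.

```python
# Hypothetical sketch of SelecMix-style pair selection + mixup (illustrative only).
import torch
import torch.nn.functional as F

def selecmix_batch(x, y, biased_model, alpha=1.0):
    """Pair each example with a 'contradicting' partner from the same batch:
    (i) same label but dissimilar biased features, or
    (ii) different label but similar biased features,
    then mix the pair as in standard mixup."""
    with torch.no_grad():
        # Assumed: penultimate-layer embeddings of the auxiliary biased model.
        z = F.normalize(biased_model.features(x), dim=1)
        sim = z @ z.t()                          # pairwise cosine similarity

    same_label = y.unsqueeze(0) == y.unsqueeze(1)
    # (i) same label -> least similar partner; (ii) different label -> most similar partner.
    score = torch.where(same_label, -sim, sim)
    score.fill_diagonal_(float('-inf'))          # never pair an example with itself
    partner = score.argmax(dim=1)

    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    x_mix = lam * x + (1 - lam) * x[partner]
    # The loss would combine y and y[partner] with the same weight lam, as in mixup.
    return x_mix, y, y[partner], lam
```

The returned mixed inputs and label pairs would then be fed to the main model with a mixup-style loss, e.g. lam * loss(pred, y) + (1 - lam) * loss(pred, y_partner).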
