Poster in Workshop: The Second Workshop on Spurious Correlations, Invariance and Stability
Saving a Split for Last-layer Retraining can Improve Group Robustness without Group Annotations
Tyler LaBonte · Vidya Muthukumar · Abhishek Kumar
Empirical risk minimization (ERM) of neural networks is prone to over-reliance on spurious correlations and poor generalization on minority groups. The recent deep feature reweighting technique achieves state-of-the-art group robustness via simple last-layer retraining, but it requires held-out group annotations to construct a group-balanced reweighting dataset. We examine this impractical requirement and find that last-layer retraining can be surprisingly effective without group annotations; in some cases, a significant gain is solely due to class balancing. Moreover, we show that instead of using the entire training dataset for ERM, dependence on spurious correlations can be reduced by holding out a small split of the training dataset for class-balanced last-layer retraining. Our experiments on four benchmarks across vision and language tasks indicate that this method improves worst-group accuracy by up to 17% over class-balanced ERM on the original dataset despite using no additional data or annotations – a surprising and unexplained result given that the two splits have equally drastic group imbalance.
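
Below is a minimal sketch of the procedure described in the abstract: reserve a small split of the training set, run ERM on the remainder, then retrain only the final linear layer on a class-balanced version of the held-out split. It assumes a PyTorch classifier whose last layer is a linear module named model.fc; all names, hyperparameters, and helper functions here are illustrative assumptions, not the authors' implementation.

    import torch
    from torch import nn
    from torch.utils.data import DataLoader, WeightedRandomSampler, random_split


    def split_train_set(full_train, holdout_frac=0.2, seed=0):
        """Reserve a small split of the training data for last-layer retraining."""
        n_holdout = int(holdout_frac * len(full_train))
        n_erm = len(full_train) - n_holdout
        generator = torch.Generator().manual_seed(seed)
        erm_split, holdout_split = random_split(
            full_train, [n_erm, n_holdout], generator=generator
        )
        return erm_split, holdout_split


    def class_balanced_loader(dataset, labels, batch_size=64):
        """Oversample minority classes so batches are approximately class-balanced
        (no group annotations are used, only class labels)."""
        labels = torch.as_tensor(labels)
        class_counts = torch.bincount(labels)
        sample_weights = 1.0 / class_counts[labels].float()
        sampler = WeightedRandomSampler(
            sample_weights, num_samples=len(labels), replacement=True
        )
        return DataLoader(dataset, batch_size=batch_size, sampler=sampler)


    def retrain_last_layer(model, holdout_loader, epochs=10, lr=1e-3):
        """Freeze the feature extractor and retrain only the final linear layer
        on the class-balanced held-out split."""
        for p in model.parameters():
            p.requires_grad = False
        model.fc.reset_parameters()
        for p in model.fc.parameters():
            p.requires_grad = True
        optimizer = torch.optim.SGD(model.fc.parameters(), lr=lr)
        criterion = nn.CrossEntropyLoss()
        model.train()
        for _ in range(epochs):
            for x, y in holdout_loader:
                optimizer.zero_grad()
                loss = criterion(model(x), y)
                loss.backward()
                optimizer.step()
        return model

In this sketch, ERM training on erm_split proceeds as usual before retrain_last_layer is called; the key point mirrored from the abstract is that the held-out split is balanced only by class label, not by group, since group annotations are assumed unavailable.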