Poster in Workshop: The Second Workshop on Spurious Correlations, Invariance and Stability
Regularizing Model Gradients with Concepts to Improve Robustness to Spurious Correlations
Yiwei Yang · Anthony Liu · Robert Wolfe · Aylin Caliskan · Bill Howe
Deep neural networks are prone to capturing correlations between spurious attributes and class labels, leading to low accuracy on some groups of the data. Existing methods rely on group labels, either during training or during validation, to improve a model's robustness to spurious correlations. We observe that if a model correlates a spurious attribute with the target class, then the model is sensitive to that attribute. In a pure vision setting, attribute labels representing bias may not be available. We propose Concept Regularization (CReg), a method that penalizes a model's sensitivity to a concept represented as a set of curated images drawn from any external source, such as image generation models or web search. Our method does not require group labels at the dataset level, instead relying on a small amount of auxiliary data, potentially irrelevant to the classification task, to represent the protected attribute. We show across datasets that CReg outperforms standard empirical risk minimization (ERM).
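The abstract does not spell out the form of the penalty, but the idea of penalizing a model's sensitivity to a curated concept set can be sketched as a gradient-norm regularizer added to the ERM objective. Below is a minimal PyTorch sketch under that assumption; `concept_sensitivity`, `creg_loss`, and `lam` are illustrative names, not the paper's actual API, and the squared input-gradient penalty is one plausible instantiation rather than the method as published.

```python
import torch
import torch.nn.functional as F


def concept_sensitivity(model, concept_images):
    """Hypothetical sensitivity term: mean squared gradient of the model's
    logits with respect to the concept images. A model that is invariant
    to the concept should have small gradients on these inputs."""
    concept_images = concept_images.clone().requires_grad_(True)
    logits = model(concept_images)
    # Sum of per-image max logits as a scalar proxy for the model's response.
    response = logits.max(dim=1).values.sum()
    # create_graph=True so the penalty itself is differentiable for training.
    grads, = torch.autograd.grad(response, concept_images, create_graph=True)
    return grads.pow(2).mean()


def creg_loss(model, x, y, concept_images, lam=1.0):
    """ERM cross-entropy on the task batch plus the concept penalty."""
    task_loss = F.cross_entropy(model(x), y)
    penalty = concept_sensitivity(model, concept_images)
    return task_loss + lam * penalty
```

In a training loop, one would backpropagate through both terms, with `lam` trading task accuracy against invariance to the concept; since the concept images come from an external source, no group labels on the training set are needed.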