Spurious Correlations, Invariance, and Stability (SCIS)

Aahlad Puli · Maggie Makar · Victor Veitch · Yoav Wald · Mark Goldstein · Limor Gultchin · Angela Zhou · Uri Shalit · Suchi Saria

Room 340 - 342


Machine learning models often break when deployed in the wild, despite excellent performance on benchmarks. In particular, models can learn to rely on apparently unnatural or irrelevant features. For instance: (1) in detecting lung disease from chest X-rays, models rely on the type of scanner rather than physiological signals; (2) in natural language inference, models rely on the number of shared words rather than the subject’s relationship with the object; (3) in precision medicine, polygenic risk scores for diseases like breast cancer rely on genes prevalent mainly in European populations and predict poorly in other populations. In these examples and others, the undesirable behavior stems from the model exploiting a spurious correlation. Improper treatment of spurious correlations can discourage the use of ML in the real world and, in extreme cases, lead to catastrophic consequences. The recent surge of interest in this issue is accordingly welcome and timely: more than 50 closely related papers have been published in ICML 2021, NeurIPS 2021, and ICLR 2022 alone. However, the most fundamental questions remain unanswered. How should the notion of spurious correlations be made precise? How should one evaluate models in the presence of spurious correlations? In which situations can a given method be expected to work, or fail? Which notions of invariance are fruitful and tractable? Further, relevant work has sprung up ad hoc in several distinct communities, with limited interplay between them: invariance and independence-constrained learning in causality-inspired ML, methods to decorrelate predictions and protected features (e.g., race) in algorithmic fairness, and stress-testing procedures to discover unexpected model dependencies in reliable ML. This workshop will bring together these communities to make progress on common foundational problems and facilitate their interaction with domain experts to build impactful collaborations.
