Understanding the performance of machine learning (ML) models across diverse data distributions is critically important for reliable applications. Although recent empirical studies posit a near-perfect linear correlation between in-distribution (ID) and out-of-distribution (OOD) accuracies, we empirically demonstrate that this correlation is more nuanced under subpopulation shifts. Through rigorous experimentation and analysis across a variety of datasets, models, and training epochs, we show that OOD performance often correlates nonlinearly with ID performance under subpopulation shifts. In contrast to prior studies positing a linear correlation in model performance during distribution shifts, our findings reveal a "moon shape" correlation (a parabolic uptrend curve) between test performance on the majority subpopulation and test performance on the minority subpopulation. This nontrivial nonlinear correlation holds across model architectures, hyperparameters, training durations, and degrees of imbalance between subpopulations. Furthermore, we find that the nonlinearity of this "moon shape" is causally influenced by the degree of spurious correlation in the training data: our controlled experiments show that stronger spurious correlations in the training data produce a more nonlinear performance correlation. We provide complementary experimental and theoretical analyses of this phenomenon and discuss its implications for ML reliability and fairness. Our work highlights the importance of understanding the nonlinear effects of model improvement on performance across subpopulations, and can inform the development of more equitable and responsible machine learning models.
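The setup described in the abstract can be illustrated with a toy simulation (not the paper's actual experiments): a training set where a spurious feature agrees with the label in the majority group but disagrees in the minority group. Tracking majority and minority test accuracy over training checkpoints of a simple linear classifier traces out a curve in the (majority accuracy, minority accuracy) plane; under strong spurious correlation the minority accuracy lags well behind the majority accuracy, which is the kind of nonlinear relationship the paper studies. All names and parameter values here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_group(n, flip_spurious, d=5):
    """Synthetic group: d weakly predictive core features plus one
    spurious feature that is aligned (or flipped) with the label."""
    y = rng.integers(0, 2, n) * 2 - 1                 # labels in {-1, +1}
    core = y[:, None] * 0.5 + rng.normal(0, 1, (n, d))
    s = (-y if flip_spurious else y) + rng.normal(0, 0.1, n)
    return np.hstack([core, s[:, None]]), y

# Training set: 95% majority (spurious aligned), 5% minority (flipped).
X_maj, y_maj = make_group(950, flip_spurious=False)
X_min, y_min = make_group(50, flip_spurious=True)
X_tr = np.vstack([X_maj, X_min])
y_tr = np.concatenate([y_maj, y_min])

# Separate test sets for each subpopulation.
Xte_maj, yte_maj = make_group(1000, flip_spurious=False)
Xte_min, yte_min = make_group(1000, flip_spurious=True)

# Logistic regression trained by gradient descent; record per-group
# test accuracy at every checkpoint.
w = np.zeros(X_tr.shape[1])
lr = 0.1
acc_maj, acc_min = [], []
for epoch in range(200):
    margins = y_tr * (X_tr @ w)
    grad = -(y_tr / (1 + np.exp(margins))) @ X_tr / len(y_tr)
    w -= lr * grad
    acc_maj.append(float(np.mean(np.sign(Xte_maj @ w) == yte_maj)))
    acc_min.append(float(np.mean(np.sign(Xte_min @ w) == yte_min)))
```

Plotting `acc_min` against `acc_maj` (rather than either against epochs) visualizes the performance correlation across subpopulations; weakening the spurious feature (e.g., raising its noise level) makes the relationship closer to linear, mirroring the controlled experiments described above.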
Author Information
Weixin Liang (Stanford University)
Yining Mao (Stanford University)
Yongchan Kwon (Columbia University)
Xinyu Yang (Zhejiang University)
James Zou (Stanford University)
More from the Same Authors
-
2021 : MetaDataset: A Dataset of Datasets for Evaluating Distribution Shifts and Training Conflicts »
Weixin Liang · James Zou -
2021 : Stateful Performative Gradient Descent »
Zachary Izzo · James Zou · Lexing Ying -
2022 : On the nonlinear correlation of ML performance across data subpopulations »
Weixin Liang · Yining Mao · Yongchan Kwon · Xinyu Yang · James Zou -
2022 : MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts »
Weixin Liang · Xinyu Yang · James Zou -
2022 : Mind the Gap: Understanding the Modality Gap in Multi-modal Contrastive Representation Learning »
Weixin Liang · Yuhui Zhang · Yongchan Kwon · Serena Yeung · James Zou -
2023 : Last-Layer Fairness Fine-tuning is Simple and Effective for Neural Networks »
Yuzhen Mao · Zhun Deng · Huaxiu Yao · Ting Ye · Kenji Kawaguchi · James Zou -
2023 : Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value »
Yongchan Kwon · James Zou -
2023 : Prospectors: Leveraging Short Contexts to Mine Salient Objects in High-dimensional Imagery »
Gautam Machiraju · Arjun Desai · James Zou · Christopher Re · Parag Mallick -
2023 : Beyond Confidence: Reliable Models Should Also Consider Atypicality »
Mert Yuksekgonul · Linjun Zhang · James Zou · Carlos Guestrin -
2023 : Less is More: Using Multiple LLMs for Applications with Lower Costs »
Lingjiao Chen · Matei Zaharia · James Zou -
2023 Poster: Data-Driven Subgroup Identification for Linear Regression »
Zachary Izzo · Ruishan Liu · James Zou -
2023 Poster: Data-OOB: Out-of-bag Estimate as a Simple and Efficient Data Value »
Yongchan Kwon · James Zou -
2023 Poster: Discover and Cure: Concept-aware Mitigation of Spurious Correlation »
Shirley Wu · Mert Yuksekgonul · Linjun Zhang · James Zou -
2022 : Invited talk #2 James Zou (Title: Machine learning to make clinical trials more efficient and diverse) »
James Zou -
2022 : GSCLIP : A Framework for Explaining Distribution Shifts in Natural Language »
Zhiying Zhu · Weixin Liang · James Zou -
2022 : 7-UP: generating in silico CODEX from a small set of immunofluorescence markers »
James Zou -
2022 : Contributed Talk 2: MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts »
Weixin Liang · Xinyu Yang · James Zou -
2022 Poster: Improving Out-of-Distribution Robustness via Selective Augmentation »
Huaxiu Yao · Yu Wang · Sai Li · Linjun Zhang · Weixin Liang · James Zou · Chelsea Finn -
2022 Spotlight: Improving Out-of-Distribution Robustness via Selective Augmentation »
Huaxiu Yao · Yu Wang · Sai Li · Linjun Zhang · Weixin Liang · James Zou · Chelsea Finn -
2022 Poster: Meaningfully debugging model mistakes using conceptual counterfactual explanations »
Abubakar Abid · Mert Yuksekgonul · James Zou -
2022 Spotlight: Meaningfully debugging model mistakes using conceptual counterfactual explanations »
Abubakar Abid · Mert Yuksekgonul · James Zou -
2019 Poster: Adaptive Monte Carlo Multiple Testing via Multi-Armed Bandits »
Martin Zhang · James Zou · David Tse -
2019 Oral: Adaptive Monte Carlo Multiple Testing via Multi-Armed Bandits »
Martin Zhang · James Zou · David Tse -
2017 Poster: Estimating the unseen from multiple populations »
Aditi Raghunathan · Greg Valiant · James Zou -
2017 Poster: Learning Latent Space Models with Angular Constraints »
Pengtao Xie · Yuntian Deng · Yi Zhou · Abhimanu Kumar · Yaoliang Yu · James Zou · Eric Xing -
2017 Talk: Learning Latent Space Models with Angular Constraints »
Pengtao Xie · Yuntian Deng · Yi Zhou · Abhimanu Kumar · Yaoliang Yu · James Zou · Eric Xing -
2017 Talk: Estimating the unseen from multiple populations »
Aditi Raghunathan · Greg Valiant · James Zou