Machine learning models often perform poorly on subgroups that are underrepresented in the training data. Yet little is understood about the variation in mechanisms that cause subpopulation shifts, or about how algorithms generalize across such diverse shifts at scale. In this work, we provide a fine-grained analysis of subpopulation shift. We first propose a unified framework that dissects and explains common shifts in subgroups. We then establish a comprehensive benchmark of 20 state-of-the-art algorithms evaluated on 12 real-world datasets in vision, language, and healthcare domains. With results obtained from training over 10,000 models, we reveal intriguing observations for future progress in this space. First, existing algorithms only improve subgroup robustness over certain types of shifts but not others. Moreover, while current algorithms rely on group-annotated validation data for model selection, we find that a simple selection criterion based on worst-class accuracy is surprisingly effective even without any group information. Finally, unlike existing works that solely aim to improve worst-group accuracy (WGA), we demonstrate the fundamental tradeoff between WGA and other important metrics, highlighting the need to carefully choose testing metrics. Code and data are available at: https://github.com/YyzHarry/SubpopBench.
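The two metrics named in the abstract can be illustrated with a small sketch. It is not taken from the SubpopBench code. It assumes that a group is a (label, attribute) pair and that model selection picks the checkpoint with the highest worst-class accuracy on a validation set. The function names, the checkpoint names, and the toy data are all hypothetical.

```python
from collections import defaultdict


def per_key_accuracy(y_true, y_pred, keys):
    """Accuracy of y_pred against y_true, computed separately for each key value."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p, k in zip(y_true, y_pred, keys):
        total[k] += 1
        correct[k] += int(t == p)
    return {k: correct[k] / total[k] for k in total}


def worst_class_accuracy(y_true, y_pred):
    # Classes are defined by the label alone, so no group annotation is needed.
    return min(per_key_accuracy(y_true, y_pred, y_true).values())


def worst_group_accuracy(y_true, y_pred, attrs):
    # Assumption: a group is a (label, attribute) pair.
    groups = list(zip(y_true, attrs))
    return min(per_key_accuracy(y_true, y_pred, groups).values())


def select_by_worst_class(y_true, candidates):
    """Return the name of the candidate with the highest worst-class accuracy."""
    return max(candidates, key=lambda name: worst_class_accuracy(y_true, candidates[name]))


# Toy validation set (hypothetical): labels, attributes, and two checkpoints' predictions.
y_val = [0, 0, 0, 0, 1, 1]
a_val = [0, 0, 0, 1, 0, 1]
candidates = {
    "ckpt_a": [0, 0, 0, 0, 0, 0],  # always predicts the majority class
    "ckpt_b": [0, 0, 1, 0, 1, 1],
}

best = select_by_worst_class(y_val, candidates)
print(best, worst_group_accuracy(y_val, candidates[best], a_val))
```

Selection here reads only the labels. The attribute list `a_val` is used solely to report worst-group accuracy afterwards, which mirrors the setting in which group annotations are unavailable at model-selection time.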
Author Information
Yuzhe Yang (MIT)
Haoran Zhang (Massachusetts Institute of Technology)
Dina Katabi (MIT)
Marzyeh Ghassemi (MIT)

Dr. Marzyeh Ghassemi is an Assistant Professor at MIT in Electrical Engineering and Computer Science (EECS) and the Institute for Medical Engineering & Science (IMES), and a Vector Institute faculty member holding a Canadian CIFAR AI Chair and Canada Research Chair. She holds MIT affiliations with the Jameel Clinic and CSAIL. Professor Ghassemi holds a Herman L. F. von Helmholtz Career Development Professorship, and was named a CIFAR Azrieli Global Scholar and one of MIT Tech Review’s 35 Innovators Under 35. Previously, she was a Visiting Researcher with Alphabet’s Verily. She is currently on leave from the University of Toronto Departments of Computer Science and Medicine. Prior to her PhD in Computer Science at MIT, she received an MSc degree in biomedical engineering from Oxford University as a Marshall Scholar, and B.S. degrees in computer science and electrical engineering as a Goldwater Scholar at New Mexico State University.
More from the Same Authors
-
2022 : Evaluating and Improving Robustness of Self-Supervised Representations to Spurious Correlations »
Kimia Hamidieh · Haoran Zhang · Marzyeh Ghassemi -
2022 : "Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts »
Haoran Zhang · Harvineet Singh · Shalmali Joshi -
2023 : Identifying Implicit Social Biases in Vision-Language Models »
Kimia Hamidieh · Haoran Zhang · Thomas Hartvigsen · Marzyeh Ghassemi -
2023 : A Pipeline for Interpretable Clinical Subtyping with Deep Metric Learning »
Haoran Zhang · Qixuan Jin · Thomas Hartvigsen · Miriam Udler · Marzyeh Ghassemi -
2023 : Continuous Time Evidential Distributions for Irregular Time Series »
Taylor Killian · Haoran Zhang · Thomas Hartvigsen · Ava Amini -
2023 : The pulse of ethical machine learning and health »
Marzyeh Ghassemi -
2023 Workshop: 3rd Workshop on Interpretable Machine Learning in Healthcare (IMLH) »
Weina Jin · Ramin Zabih · S. Kevin Zhou · Yuyin Zhou · Xiaoxiao Li · Yifan Peng · Zongwei Zhou · Yucheng Tang · Yuzhe Yang · Agni Kumar -
2023 Poster: When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction »
Vinith Suriyakumar · Marzyeh Ghassemi · Berk Ustun -
2023 Oral: When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction »
Vinith Suriyakumar · Marzyeh Ghassemi · Berk Ustun -
2023 Poster: "Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts »
Haoran Zhang · Harvineet Singh · Marzyeh Ghassemi · Shalmali Joshi -
2023 Invited Talk: Taking the Pulse Of Ethical ML in Health »
Marzyeh Ghassemi -
2022 : Invited talks 2 Q/A, Christina and Marzyeh »
Christina Heinze-Deml · Marzyeh Ghassemi -
2022 : Invited talks 2, Christina Heinze-Deml and Marzyeh Ghassemi »
Christina Heinze-Deml · Marzyeh Ghassemi -
2021 Poster: Delving into Deep Imbalanced Regression »
Yuzhe Yang · Kaiwen Zha · YINGCONG CHEN · Hao Wang · Dina Katabi -
2021 Oral: Delving into Deep Imbalanced Regression »
Yuzhe Yang · Kaiwen Zha · YINGCONG CHEN · Hao Wang · Dina Katabi -
2021 Poster: Simultaneous Similarity-based Self-Distillation for Deep Metric Learning »
Karsten Roth · Timo Milbich · Bjorn Ommer · Joseph Paul Cohen · Marzyeh Ghassemi -
2021 Spotlight: Simultaneous Similarity-based Self-Distillation for Deep Metric Learning »
Karsten Roth · Timo Milbich · Bjorn Ommer · Joseph Paul Cohen · Marzyeh Ghassemi -
2020 Workshop: Healthcare Systems, Population Health, and the Role of Health-tech »
Creighton Heaukulani · Konstantina Palla · Katherine Heller · Niranjani Prasad · Marzyeh Ghassemi -
2020 Poster: Continuously Indexed Domain Adaptation »
Hao Wang · Hao He · Dina Katabi -
2019 Poster: ME-Net: Towards Effective Adversarial Robustness with Matrix Estimation »
Yuzhe Yang · GUO ZHANG · Zhi Xu · Dina Katabi -
2019 Oral: ME-Net: Towards Effective Adversarial Robustness with Matrix Estimation »
Yuzhe Yang · GUO ZHANG · Zhi Xu · Dina Katabi -
2019 Poster: Circuit-GNN: Graph Neural Networks for Distributed Circuit Design »
GUO ZHANG · Hao He · Dina Katabi -
2019 Oral: Circuit-GNN: Graph Neural Networks for Distributed Circuit Design »
GUO ZHANG · Hao He · Dina Katabi -
2017 Poster: Learning Sleep Stages from Radio Signals: A Conditional Adversarial Architecture »
Mingmin Zhao · Shichao Yue · Dina Katabi · Tommi Jaakkola · Matt Bianchi -
2017 Talk: Learning Sleep Stages from Radio Signals: A Conditional Adversarial Architecture »
Mingmin Zhao · Shichao Yue · Dina Katabi · Tommi Jaakkola · Matt Bianchi