Where Does My Model Underperform?: A Human Evaluation of Slice Discovery Algorithms
Nari Johnson · Ángel Alexander Cabrera · Gregory Plumb · Ameet Talwalkar
Event URL: https://openreview.net/forum?id=HnyYGxRliS »
A growing number of works propose tools to help stakeholders form hypotheses about the behavior of machine learning models. We focus our study on slice discovery algorithms: automated methods that aim to group together coherent and high-error "slices" (i.e. subsets) of data. While these tools purport to help users identify where (on which subgroups) their model underperforms, there has been little evaluation of whether they help users achieve their proposed goals. We run a controlled user study $(N = 15)$ to evaluate if the slices output by two existing slice discovery algorithms help users form correct hypotheses about an image classification model. Our results provide positive evidence that existing tools provide benefit relative to a naive baseline, and challenge dominant assumptions shared by past work.
Author Information
Nari Johnson (Carnegie Mellon University)
Ángel Alexander Cabrera (Carnegie Mellon University)
Gregory Plumb (Carnegie Mellon University)
Ameet Talwalkar (Carnegie Mellon University)
More from the Same Authors
- 2021 : Interpretable Machine Learning: Moving From Mythos to Diagnostics »
  Valerie Chen · Jeffrey Li · Joon Kim · Gregory Plumb · Ameet Talwalkar
- 2022 : SimpleSpot and Evaluating Systemic Errors using Synthetic Image Datasets »
  Gregory Plumb · Nari Johnson · Ángel Alexander Cabrera · Marco Ribeiro · Ameet Talwalkar
- 2022 : Perspectives on Incorporating Expert Feedback into Model Updates »
  Valerie Chen · Umang Bhatt · Hoda Heidari · Adrian Weller · Ameet Talwalkar
- 2023 : Paper Spotlights »
  Andrew Ilyas · Alizée Pace · Ji Won Park · Adam Breitholtz · Nari Johnson
- 2023 Oral: Cross-Modal Fine-Tuning: Align then Refine »
  Junhong Shen · Liam Li · Lucio Dery · Corey Staten · Mikhail Khodak · Graham Neubig · Ameet Talwalkar
- 2023 Poster: Cross-Modal Fine-Tuning: Align then Refine »
  Junhong Shen · Liam Li · Lucio Dery · Corey Staten · Mikhail Khodak · Graham Neubig · Ameet Talwalkar
- 2022 Poster: Sanity Simulations for Saliency Methods »
  Joon Kim · Gregory Plumb · Ameet Talwalkar
- 2022 Spotlight: Sanity Simulations for Saliency Methods »
  Joon Kim · Gregory Plumb · Ameet Talwalkar
- 2021 : Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing (Q&A) »
  Ameet Talwalkar
- 2021 : Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing »
  Ameet Talwalkar
- 2020 Poster: FACT: A Diagnostic for Group Fairness Trade-offs »
  Joon Kim · Jiahao Chen · Ameet Talwalkar
- 2020 Poster: Explaining Groups of Points in Low-Dimensional Representations »
  Gregory Plumb · Jonathan Terhorst · Sriram Sankararaman · Ameet Talwalkar
- 2019 : ARUBA: Efficient and Adaptive Meta-Learning with Provable Guarantees (Ameet Talwalkar) »
  Ameet Talwalkar
- 2019 Workshop: Adaptive and Multitask Learning: Algorithms & Systems »
  Maruan Al-Shedivat · Anthony Platanios · Otilia Stretcu · Jacob Andreas · Ameet Talwalkar · Rich Caruana · Tom Mitchell · Eric Xing
- 2019 : Poster Session 1 (all papers) »
  Matilde Gargiani · Yochai Zur · Chaim Baskin · Evgenii Zheltonozhskii · Liam Li · Ameet Talwalkar · Xuedong Shang · Harkirat Singh Behl · Atilim Gunes Baydin · Ivo Couckuyt · Tom Dhaene · Chieh Lin · Wei Wei · Min Sun · Orchid Majumder · Michele Donini · Yoshihiko Ozaki · Ryan P. Adams · Christian Geißler · Ping Luo · zhanglin peng · Ruimao Zhang · John Langford · Rich Caruana · Debadeepta Dey · Charles Weill · Xavi Gonzalvo · Scott Yang · Scott Yak · Eugen Hotaj · Vladimir Macko · Mehryar Mohri · Corinna Cortes · Stefan Webb · Jonathan Chen · Martin Jankowiak · Noah Goodman · Aaron Klein · Frank Hutter · Mojan Javaheripi · Mohammad Samragh · Sungbin Lim · Taesup Kim · SUNGWOONG KIM · Michael Volpp · Iddo Drori · Yamuna Krishnamurthy · Kyunghyun Cho · Stanislaw Jastrzebski · Quentin de Laroussilhe · Mingxing Tan · Xiao Ma · Neil Houlsby · Andrea Gesmundo · Zalán Borsos · Krzysztof Maziarz · Felipe Petroski Such · Joel Lehman · Kenneth Stanley · Jeff Clune · Pieter Gijsbers · Joaquin Vanschoren · Felix Mohr · Eyke Hüllermeier · Zheng Xiong · Wenpeng Zhang · Wenwu Zhu · Weijia Shao · Aleksandra Faust · Michal Valko · Michael Y Li · Hugo Jair Escalante · Marcel Wever · Andrey Khorlin · Tara Javidi · Anthony Francis · Saurajit Mukherjee · Jungtaek Kim · Michael McCourt · Saehoon Kim · Tackgeun You · Seungjin Choi · Nicolas Knudde · Alexander Tornede · Ghassen Jerfel
- 2019 Poster: Provable Guarantees for Gradient-Based Meta-Learning »
  Nina Balcan · Mikhail Khodak · Ameet Talwalkar
- 2019 Oral: Provable Guarantees for Gradient-Based Meta-Learning »
  Nina Balcan · Mikhail Khodak · Ameet Talwalkar