Prediction models often perform poorly when deployed on distributions different from those seen during training. To understand these operational failure modes of ML models, we develop methods to attribute the drop in performance to different types of distribution shift. Our approach decomposes the performance drop into 1) an increase in examples that are harder but were frequently seen during training, 2) changes in the relationship between outcome and features, and 3) poor performance on examples that were infrequent or unseen during training. Our procedure is principled yet flexible enough to incorporate any feature mapping or metadata. We empirically demonstrate how our decomposition can inform different ways to improve model performance under different distribution shifts.
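The three-way split described above can be illustrated with a toy sketch for discrete features. Everything here is an assumption made for illustration rather than something stated in the abstract: the function names, the use of per-feature-value conditional risks, and in particular the choice of a "shared" distribution with mass proportional to p·q/(p+q) as the intermediate step between training and target. The terms telescope, so they sum to the total performance change whatever shared distribution is chosen.

```python
def expectation(weights, values):
    """Weighted average of per-feature-value risks."""
    return sum(w * v for w, v in zip(weights, values))


def decompose_performance_drop(p, q, risk_p, risk_q):
    """Split E_q[risk_q] - E_p[risk_p] into three telescoping terms.

    p, q           : probability mass of each discrete feature value under the
                     training and target distributions
    risk_p, risk_q : expected loss given each feature value under the training
                     and target distributions
    """
    # Assumed shared distribution: large only where BOTH p and q put mass.
    raw = [pi * qi / (pi + qi) if pi + qi > 0 else 0.0 for pi, qi in zip(p, q)]
    total = sum(raw)
    if total == 0:
        raise ValueError("p and q have disjoint support; no shared distribution")
    s = [r / total for r in raw]

    return {
        # 1) more weight on harder examples that training already covered
        "x_shift_p_to_s": expectation(s, risk_p) - expectation(p, risk_p),
        # 2) change in the outcome-feature relationship on the shared region
        "y_given_x_shift": expectation(s, risk_q) - expectation(s, risk_p),
        # 3) examples that were infrequent or unseen during training
        "x_shift_s_to_q": expectation(q, risk_q) - expectation(s, risk_q),
    }


# Hypothetical numbers: three feature values, the last one rare in training.
terms = decompose_performance_drop(
    p=[0.5, 0.4, 0.1],
    q=[0.2, 0.3, 0.5],
    risk_p=[0.1, 0.3, 0.6],
    risk_q=[0.1, 0.5, 0.6],
)
for name, value in terms.items():
    print(f"{name}: {value:+.4f}")
```

In practice the conditional risks and the shared distribution would have to be estimated from data; this sketch takes them as given so that the accounting identity is easy to see.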
Author Information
Tianhui Cai (Columbia University)
Hongseok Namkoong (Columbia University)
Steve Yadlowsky (Google Brain)
More from the Same Authors
- 2023 Poster: Underspecification Presents Challenges for Credibility in Modern Machine Learning
Alexander D'Amour · Katherine Heller · Dan Moldovan · Ben Adlam · Babak Alipanahi · Alex Beutel · Christina Chen · Jonathan Deaton · Jacob Eisenstein · Matthew Hoffman · Farhad Hormozdiari · Neil Houlsby · Shaobo Hou · Ghassen Jerfel · Alan Karthikesalingam · Mario Lucic · Yian Ma · Cory McLean · Diana Mincu · Akinori Mitani · Andrea Montanari · Zachary Nado · Vivek Natarajan · Christopher Nielson · Thomas F. Osborne · Rajiv Raman · Kim Ramasamy · Rory Sayres · Jessica Schrouff · Martin Seneviratne · Shannon Sequeira · Harini Suresh · Victor Veitch · Maksym Vladymyrov · Xuezhi Wang · Kellie Webster · Steve Yadlowsky · Taedong Yun · Xiaohua Zhai · D. Sculley
- 2022 Poster: Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman · Gabriel Ilharco · Samir Gadre · Rebecca Roelofs · Raphael Gontijo Lopes · Ari Morcos · Hongseok Namkoong · Ali Farhadi · Yair Carmon · Simon Kornblith · Ludwig Schmidt
- 2022 Spotlight: Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time
Mitchell Wortsman · Gabriel Ilharco · Samir Gadre · Rebecca Roelofs · Raphael Gontijo Lopes · Ari Morcos · Hongseok Namkoong · Ali Farhadi · Yair Carmon · Simon Kornblith · Ludwig Schmidt