Poster in Workshop: Principles of Distribution Shift (PODS)
Diagnosing Model Performance Under Distribution Shift
Tianhui Cai · Hongseok Namkoong · Steve Yadlowsky
Prediction models perform poorly when deployed on distributions that differ from those seen during training. To understand these operational failure modes of ML models, we develop methods that attribute the drop in performance to different types of distribution shift. Our approach decomposes the performance drop into 1) a larger share of examples that are harder but were frequently seen during training, 2) changes in the relationship between the outcome and features, and 3) poor performance on examples that were infrequent or unseen during training. Our procedure is principled yet flexible enough to incorporate any feature mapping or metadata. We empirically demonstrate how our decomposition can inform different ways to improve model performance under different distribution shifts.
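Schematically, a decomposition of this kind can be written as a telescoping sum. The notation below is an illustrative sketch, not taken from the poster itself: $\ell$ is the loss of a fixed model, $P$ and $Q$ are the training and deployment distributions over $(X, Y)$, and $S$ is an assumed auxiliary covariate distribution supported on the overlap of $P_X$ and $Q_X$.

\begin{aligned}
\mathbb{E}_{Q}[\ell] - \mathbb{E}_{P}[\ell]
&= \underbrace{\mathbb{E}_{X \sim S}\,\mathbb{E}_{Y \mid X \sim P}[\ell] \;-\; \mathbb{E}_{X \sim P_X}\,\mathbb{E}_{Y \mid X \sim P}[\ell]}_{\text{1) harder but frequently seen examples}} \\[2pt]
&\quad + \underbrace{\mathbb{E}_{X \sim S}\,\mathbb{E}_{Y \mid X \sim Q}[\ell] \;-\; \mathbb{E}_{X \sim S}\,\mathbb{E}_{Y \mid X \sim P}[\ell]}_{\text{2) change in the relationship between outcome and features}} \\[2pt]
&\quad + \underbrace{\mathbb{E}_{X \sim Q_X}\,\mathbb{E}_{Y \mid X \sim Q}[\ell] \;-\; \mathbb{E}_{X \sim S}\,\mathbb{E}_{Y \mid X \sim Q}[\ell]}_{\text{3) infrequent or unseen examples}}.
\end{aligned}

Each difference isolates one type of shift: the first changes only the covariate weighting within regions covered during training, the second changes only the conditional distribution of $Y$ given $X$, and the third moves covariate mass to regions outside the shared support.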