When Personalization Harms Performance: Reconsidering the Use of Group Attributes in Prediction
Vinith Suriyakumar · Marzyeh Ghassemi · Berk Ustun

Thu Jul 27 04:30 PM -- 06:00 PM (PDT) @ Exhibit Hall 1 #209

Machine learning models are often personalized with categorical attributes that define groups. In this work, we show that personalization with group attributes can inadvertently reduce performance at a group level -- i.e., groups may receive unnecessarily inaccurate predictions by sharing their personal characteristics. We present formal conditions to ensure the fair use of group attributes in a prediction task, and describe how they can be checked by training one additional model. We characterize how fair use conditions can be violated due to standard practices in model development, and study the prevalence of fair use violations in clinical prediction tasks. Our results show that personalization often fails to produce a tailored performance gain for every group that reports personal data, and underscore the need to evaluate fair use when personalizing models with characteristics that are protected, sensitive, self-reported, or costly to acquire.
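
As a rough illustration of the kind of group-level check the abstract describes, the sketch below compares per-group error rates of a personalized model against a second model. It assumes that the "one additional model" is a generic model trained without the group attribute and that performance is measured by error rate; both are assumptions made for illustration, not details stated in the abstract. The function names and the toy data are hypothetical.

```python
from collections import defaultdict


def group_error_rates(y_true, y_pred, groups):
    """Return {group: fraction of misclassified examples in that group}."""
    wrong = defaultdict(int)
    total = defaultdict(int)
    for y, p, g in zip(y_true, y_pred, groups):
        total[g] += 1
        wrong[g] += int(y != p)
    return {g: wrong[g] / total[g] for g in total}


def groups_harmed(y_true, pred_personalized, pred_generic, groups):
    """Return groups whose error is higher under the personalized model.

    Maps each such group to (personalized_error, generic_error).
    Assumption: the generic model was trained without the group attribute.
    """
    err_p = group_error_rates(y_true, pred_personalized, groups)
    err_g = group_error_rates(y_true, pred_generic, groups)
    return {g: (err_p[g], err_g[g]) for g in err_p if err_p[g] > err_g[g]}


# Toy data, made up purely for illustration (not results from the paper).
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
pred_personalized = [1, 0, 1, 0, 0, 1, 1, 0]
pred_generic = [1, 1, 1, 0, 1, 0, 1, 0]

harmed = groups_harmed(y_true, pred_personalized, pred_generic, groups)
print(harmed)
```

In this toy example, group "a" gains from personalization while group "b" does worse than it would under the generic model, which is the kind of group-level harm the abstract warns about. In practice the comparison would be run on held-out data and would need to account for sampling noise.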

