Oral
Fairness without Harm: Decoupled Classifiers with Preference Guarantees
Berk Ustun · Yang Liu · David Parkes

Thu Jun 13 11:35 AM -- 11:40 AM (PDT) @ Seaside Ballroom

Should we train different classifiers for groups defined by sensitive attributes, such as gender and ethnicity? In a domain such as medicine, it may be ethical to allow classifiers to vary by group membership -- so long as treatment disparity is aligned with the principles of beneficence ("do the best") and non-maleficence ("do no harm"). We argue that classifiers should satisfy preference guarantees for individuals who are subjected to disparate treatment: the majority of individuals in each group should prefer their classifier in comparison to (i) a pooled classifier that makes no use of sensitive attributes (rationality, responsive to non-maleficence) and (ii) the classifier assigned to any other group (envy-freeness, responsive to beneficence). Standard decoupled training, which fits a separate classifier for each group, may fail (i) or (ii) due to data disparities or heterogeneity in the data generating distributions between groups. We introduce a recursive decoupling procedure that adaptively chooses group attributes for decoupling, and present formal conditions for achieving these preference guarantees. We illustrate the benefits of our approach through experiments on real-world datasets, showing that it can safely improve performance for groups defined by multiple sensitive attributes without violating preference guarantees on test data.
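
The two guarantees can be read as checks on held-out data. The sketch below is an illustration only and not the authors' procedure: it assumes a group's preference can be proxied by its empirical error rate, and the names `error_rate`, `check_preference_guarantees`, and `tol` are hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Classifier = Callable[[Sequence[float]], int]
Sample = Tuple[Sequence[float], int]  # (features, label)


def error_rate(clf: Classifier, data: List[Sample]) -> float:
    """Fraction of samples in `data` that `clf` misclassifies."""
    if not data:
        raise ValueError("cannot evaluate a classifier on an empty group")
    return sum(clf(x) != y for x, y in data) / len(data)


def check_preference_guarantees(
    group_clfs: Dict[Hashable, Classifier],
    pooled_clf: Classifier,
    group_data: Dict[Hashable, List[Sample]],
    tol: float = 0.0,
) -> Dict[Hashable, Dict[str, bool]]:
    """Check rationality and envy-freeness for each group.

    Assumption of this sketch: a group "prefers" the classifier with the
    lower empirical error on that group's own data.
    """
    report = {}
    for g, data in group_data.items():
        own = error_rate(group_clfs[g], data)
        # Rationality: the group's classifier is no worse than the pooled one.
        rational = own <= error_rate(pooled_clf, data) + tol
        # Envy-freeness: no other group's classifier does better on this group.
        envy_free = all(
            own <= error_rate(other, data) + tol
            for k, other in group_clfs.items()
            if k != g
        )
        report[g] = {"rational": rational, "envy_free": envy_free}
    return report
```

A group whose entry shows `rational: False` would be better served by the pooled classifier under this proxy, which is the kind of harm the preference guarantees are meant to rule out.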