Fairness without Harm: Decoupled Classifiers with Preference Guarantees
Berk Ustun · Yang Liu · David Parkes

Thu Jun 13th 11:35 -- 11:40 AM @ Seaside Ballroom

Should we train different classifiers for groups defined by sensitive attributes, such as gender and ethnicity? In a domain such as medicine, it may be ethical to allow classifiers to vary by group membership -- so long as treatment disparity is aligned with the principles of beneficence ("do the best") and non-maleficence ("do no harm"). We argue that classifiers should satisfy preference guarantees for individuals who are subjected to disparate treatment: the majority of individuals in each group should prefer their classifier to (i) a pooled classifier that makes no use of sensitive attributes (rationality, responsive to non-maleficence) and (ii) the classifier assigned to any other group (envy-freeness, responsive to beneficence). Standard decoupled training, which fits a separate classifier for each group, may fail (i) or (ii) due to data disparities or heterogeneity in the data-generating distributions between groups. We introduce a recursive decoupling procedure that adaptively chooses group attributes for decoupling, and present formal conditions for achieving these preference guarantees. We illustrate the benefits of our approach through experiments on real-world datasets, showing that it can safely improve performance for groups defined by multiple sensitive attributes without violating preference guarantees on test data.
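
The two guarantees can be illustrated with a minimal sketch. It assumes that a group "prefers" a classifier when that classifier has a lower error rate on the group's own data; the abstract does not fix the metric, so this is an illustrative assumption rather than the authors' definition. All names (`error_rate`, `check_preference_guarantees`, the toy classifiers and data) are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Classifier = Callable[[Sequence[float]], int]
GroupData = Dict[str, Tuple[List[Sequence[float]], List[int]]]


def error_rate(clf: Classifier, X: List[Sequence[float]], y: List[int]) -> float:
    """Fraction of examples in (X, y) that clf misclassifies."""
    return sum(clf(x) != t for x, t in zip(X, y)) / len(y)


def check_preference_guarantees(
    group_data: GroupData,
    group_clfs: Dict[str, Classifier],
    pooled_clf: Classifier,
) -> Dict[str, Dict[str, bool]]:
    """Check both guarantees for every group (assumed metric: error rate).

    rationality:   the group's own classifier is no worse than the pooled one.
    envy_freeness: the group's own classifier is no worse than any classifier
                   assigned to another group.
    """
    report = {}
    for g, (X, y) in group_data.items():
        own = error_rate(group_clfs[g], X, y)
        rational = own <= error_rate(pooled_clf, X, y)
        envy_free = all(
            own <= error_rate(group_clfs[h], X, y)
            for h in group_clfs
            if h != g
        )
        report[g] = {"rationality": rational, "envy_freeness": envy_free}
    return report


# Toy data: the two groups have opposite label patterns (heterogeneity).
data: GroupData = {
    "A": ([[0], [1], [2], [3]], [0, 0, 1, 1]),
    "B": ([[0], [1], [2], [3]], [1, 1, 0, 0]),
}
clf_a: Classifier = lambda x: int(x[0] >= 2)  # fits group A exactly
clf_b: Classifier = lambda x: int(x[0] < 2)   # fits group B exactly
pooled: Classifier = lambda x: 0              # ignores group membership

good = check_preference_guarantees(data, {"A": clf_a, "B": clf_b}, pooled)
swapped = check_preference_guarantees(data, {"A": clf_b, "B": clf_a}, pooled)
```

In this toy setup, `good` satisfies both guarantees for both groups, while `swapped` (each group handed the other group's classifier) violates both. Such checks would normally be run on held-out data, since the abstract states the guarantees with respect to test data.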

Author Information

Berk Ustun (Harvard University)
Yang Liu (UCSC)
David Parkes (Harvard University)
