
Workshop: Workshop on Socially Responsible Machine Learning

Towards Explainable and Fair Supervised Learning

Aarshee Mishra · Nicholas Perello · Przemyslaw Grabowicz


Algorithms that aid human decision-making may inadvertently discriminate against certain protected groups. We formalize direct discrimination as a direct causal effect of the protected attributes on the decisions, and induced indirect discrimination as a change in the influence of non-protected features that are associated with the protected attributes. Measurements of the average treatment effect (ATE) and SHapley Additive exPlanations (SHAP) reveal that state-of-the-art fair learning methods can inadvertently induce indirect discrimination in synthetic and real-world datasets. To inhibit discrimination in algorithmic systems, we propose nullifying the influence of the protected attribute on the output of the system while preserving the influence of the remaining features. To achieve this, we introduce a risk minimization method that optimizes for the proposed fairness objective. We show that the method leverages model accuracy and disparity measures.
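As an illustration of the two ideas above (measuring a direct effect of a protected attribute on a model's output, and nullifying it while keeping the influence of the other features), here is a minimal sketch on a toy linear model. It is not the authors' risk minimization method: the helper names `direct_effect` and `marginalize_protected`, the synthetic data, and the choice to average predictions over the empirical distribution of the protected attribute are all assumptions made for this example.

```python
import numpy as np

def direct_effect(predict, X, a_col):
    """Average change in output when the protected column is set to 1
    versus 0, with every other feature held fixed (hypothetical helper)."""
    X1 = X.copy()
    X1[:, a_col] = 1.0
    X0 = X.copy()
    X0[:, a_col] = 0.0
    return float(np.mean(predict(X1) - predict(X0)))

def marginalize_protected(predict, X, a_col):
    """Build a predictor that ignores the protected column by averaging
    predictions over its empirical distribution in X (hypothetical helper)."""
    values, counts = np.unique(X[:, a_col], return_counts=True)
    weights = counts / counts.sum()

    def fair_predict(X_new):
        out = np.zeros(len(X_new))
        for v, w in zip(values, weights):
            Xv = X_new.copy()
            Xv[:, a_col] = v  # overwrite the protected attribute
            out += w * predict(Xv)
        return out

    return fair_predict

# Synthetic data (assumed): binary protected attribute `a` and one
# non-protected feature `x` that is correlated with `a`.
rng = np.random.default_rng(0)
n = 1000
a = rng.integers(0, 2, size=n).astype(float)
x = rng.normal(size=n) + 0.5 * a
X = np.column_stack([a, x])

# Toy model (assumed): output depends directly on both `a` and `x`.
coef = np.array([2.0, 1.0])
predict = lambda X: X @ coef

ate_before = direct_effect(predict, X, a_col=0)
fair_predict = marginalize_protected(predict, X, a_col=0)
ate_after = direct_effect(fair_predict, X, a_col=0)

# Influence of the non-protected feature: change in output per unit of x.
X_shift = X.copy()
X_shift[:, 1] += 1.0
x_influence_after = float(np.mean(fair_predict(X_shift) - fair_predict(X)))
```

In this toy setting the direct effect of the protected attribute equals its coefficient before marginalization and vanishes afterwards, while the per-unit influence of `x` on the output is unchanged. Real models and datasets would need an actual causal analysis rather than this simple input intervention.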
