Directional Bias Amplification

Angelina Wang · Olga Russakovsky

Keywords: [ Learning Theory ] [ Reinforcement Learning and Planning ] [ Fairness, Accountability, and Transparency ] [ Exploration ] [ Algorithms -> Bandit Algorithms; Reinforcement Learning and Planning -> Reinforcement Learning; Theory ] [ Social Aspects of Machine Learning ]

[ Visit Poster at Spot D4 in Virtual World ]
Thu 22 Jul 9 a.m. PDT — 11 a.m. PDT
Spotlight presentation: Fairness
Thu 22 Jul 5 a.m. PDT — 6 a.m. PDT

Abstract: Mitigating bias in machine learning systems requires refining our understanding of bias propagation pathways: from societal structures to large-scale data to trained models to impact on society. In this work, we focus on one aspect of the problem, namely bias amplification: the tendency of models to amplify the biases present in the data they are trained on. A metric for measuring bias amplification was introduced in the seminal work by Zhao et al. (2017); however, as we demonstrate, this metric suffers from a number of shortcomings including conflating different types of bias amplification and failing to account for varying base rates of protected attributes. We introduce and analyze a new, decoupled metric for measuring bias amplification, $BiasAmp_{\rightarrow}$ (Directional Bias Amplification). We thoroughly analyze and discuss both the technical assumptions and normative implications of this metric. We provide suggestions about its measurement by cautioning against predicting sensitive attributes, encouraging the use of confidence intervals due to fluctuations in the fairness of models across runs, and discussing the limitations of what this metric captures. Throughout this paper, we work to provide an interrogative look at the technical measurement of bias amplification, guided by our normative ideas of what we want it to encompass. Code is located at
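To make the idea of a decoupled, directional measurement concrete, here is a minimal sketch (not the authors' reference implementation) of one direction of such a metric, attribute → task, for a single binary attribute–task pair. It follows the construction described in the abstract: first determine whether the attribute and task positively co-occur in the training data, then measure how much the model's predicted task rate among attribute-positive examples departs from the ground-truth rate, signed so that amplifying the existing correlation is positive. The function name and exact normalization are illustrative assumptions.

```python
import numpy as np

def bias_amp_a_to_t(A, T, T_hat):
    """Hedged sketch of an attribute->task directional bias amplification
    score for a single binary (attribute a, task t) pair.

    A:     (n,) binary protected-attribute labels (ground truth)
    T:     (n,) binary task labels (ground truth)
    T_hat: (n,) binary task predictions from the model
    """
    A, T, T_hat = (np.asarray(x, dtype=float) for x in (A, T, T_hat))

    # y_at: does attribute a positively correlate with task t in the data?
    p_a, p_t = A.mean(), T.mean()
    p_at = (A * T).mean()
    y_at = 1.0 if p_at > p_a * p_t else 0.0

    # delta: change in P(task = 1 | attribute = 1) from labels to predictions
    delta = T_hat[A == 1].mean() - T[A == 1].mean()

    # Sign the change so that strengthening the existing correlation
    # (in either direction) counts as positive amplification.
    return y_at * delta + (1.0 - y_at) * (-delta)
```

For example, if the model predicts the task positive for every attribute-positive example even though only three quarters of them are truly positive, the score comes out positive, flagging amplification in the A → T direction; the full metric would average such terms over all attribute–task pairs and compute the T → A direction analogously by conditioning on predicted attributes.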
