A Game-Theoretic Framework for Measuring and Explaining Metric Compatibility in Fair Machine Learning
Lingfeng Zhang ⋅ Jingran Yang ⋅ Zhaohui Wang ⋅ Min Zhang ⋅ Qing Zhang
Abstract
Machine learning fairness research documents trade-offs but lacks quantitative frameworks for measuring the intrinsic compatibility of metrics without requiring causal graphs. We introduce a game-theoretic framework that decomposes metrics into interaction vectors, enabling compatibility between metrics to be measured via cosine similarity and mechanistically attributed to attribute coalitions. Through an analysis spanning 6 datasets, 7 models, and 6 debiasing methods, we show that fairness and utility are often structurally orthogonal (median compatibility $\approx 0$) rather than diametrically opposed, with conflicts driven by sparse, low-order interactions. We further show that debiasing improves fairness by compressing the compatibility space, shrinking the magnitude of both synergistic and conflicting relationships rather than eliminating conflicts, which provides a mechanistic basis for understanding metric alignment.
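The core compatibility measure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the interaction vectors here are hypothetical placeholders standing in for each metric's decomposition over attribute coalitions, and only the cosine-similarity step is shown.

```python
import numpy as np

def compatibility(u, v):
    """Cosine similarity between two metrics' interaction vectors.

    +1 indicates fully synergistic metrics, 0 structural orthogonality,
    and -1 diametrically opposed metrics.
    """
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical interaction vectors (one entry per attribute coalition).
fairness_vec = np.array([0.8, -0.1, 0.0, 0.3])
utility_vec = np.array([0.1, 0.9, -0.2, 0.0])

# Near-zero compatibility: the metrics are roughly orthogonal,
# not diametrically opposed.
print(compatibility(fairness_vec, utility_vec))
```

Under this reading, a debiasing method that "compresses the compatibility space" would push such cosine values toward zero for both positively and negatively aligned metric pairs.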