Correcting Overparameterization Effects in Fair Empirical Risk Minimization
Abstract
Bias mitigation is particularly challenging for overparameterized machine learning (ML) models. Overfitting the training data not only amplifies data bias induced by spurious correlations, but also causes standard bias mitigation methods to fail. To provide actionable insights into this challenge, we present a precise analysis of fair empirical risk minimization (ERM) in the overparameterized regime. Importantly, we show that even though conventional fair ERM fails on overparameterized models, the approach can be corrected by modifying the equality fairness constraint to allow for bias overcompensation. Moreover, our analysis yields an empirical criterion for strong equalized odds: balanced group-conditional means of the representer coefficients, indicating an equal average contribution from each sensitive group. Motivated by this result, we provide an estimable search interval that localizes the overcompensation level required for balanced coefficients. Despite the asymptotic nature of our findings, they closely capture the statistical behavior of moderately large ML models.