Optimal Fair Aggregation of Crowdsourced Noisy Labels using Demographic Parity Constraints
Samuel Gruffaz ⋅ Gabriel Singer ⋅ Olivier VO VAN ⋅ Nicolas Vayatis ⋅ Argyris Kalogeratos
Abstract
In many machine learning applications, acquiring reliable ground-truth labels is costly or infeasible, leading practitioners to rely on crowdsourcing and the aggregation of noisy human annotations. When labels are subjective, however, aggregation may amplify individual biases, particularly with respect to sensitive attributes, raising fairness concerns. Despite this, fairness in crowdsourced aggregation remains largely unexplored, with no existing convergence guarantees and only limited post-processing approaches for enforcing $\varepsilon$-fairness under demographic parity. We address this gap by analyzing the fairness properties of crowdsourcing aggregation methods within the $\varepsilon$-fairness framework, focusing on Majority Voting and Optimal Bayesian aggregation. In the small-crowd regime, we derive an upper bound on the fairness gap of Majority Voting in terms of the individual annotators' fairness gaps. We further show that the fairness gap of the aggregated consensus converges exponentially fast to that of the ground truth under interpretable conditions. Since the ground truth itself may still be unfair, we generalize a state-of-the-art multiclass fairness post-processing algorithm from the continuous to the discrete setting, enabling the enforcement of strict demographic parity constraints for any aggregation rule. Experiments on both synthetic and real-world datasets demonstrate the effectiveness of our approach and corroborate the theoretical insights.
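To make the two central quantities of the abstract concrete, the following is a minimal illustrative sketch (not the paper's method): a Majority Voting aggregator over binary annotations, and the demographic parity gap, here taken as the maximum difference in positive-label rates across sensitive groups. The simulation setup (7 annotators, a group-dependent labeling bias) is a hypothetical example.

```python
import numpy as np

def majority_vote(annotations):
    """Aggregate binary labels by majority; annotations has shape (n_items, n_annotators)."""
    return (annotations.mean(axis=1) >= 0.5).astype(int)

def dp_gap(labels, groups):
    """Demographic parity gap: max difference of P(label=1 | group) across groups."""
    rates = [labels[groups == g].mean() for g in np.unique(groups)]
    return max(rates) - min(rates)

# Hypothetical simulation: annotators label group-1 items positive more often,
# so the aggregated consensus inherits a nonzero demographic parity gap.
rng = np.random.default_rng(0)
groups = rng.integers(0, 2, size=1000)
p_positive = np.where(groups == 1, 0.7, 0.5)            # group-dependent bias
annotations = (rng.random((1000, 7)) < p_positive[:, None]).astype(int)
consensus = majority_vote(annotations)
print(f"DP gap of majority-vote consensus: {dp_gap(consensus, groups):.3f}")
```

The post-processing step discussed in the abstract would then adjust such a consensus so that this gap is at most a prescribed $\varepsilon$.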