Abstract:
We study the problem of \emph{classifier derandomization} in machine learning: given a stochastic binary classifier , sample a deterministic classifier that approximates the output of in aggregate over any data distribution. Recent work revealed how to efficiently derandomize a stochastic classifier with strong output approximation guarantees, but at the cost of individual fairness --- that is, if treated similar inputs similarly, did not. In this paper, we initiate a systematic study of classifier derandomization with metric fairness guarantees. We show that the prior derandomization approach is almost maximally metric-unfair, and that a simple random threshold'' derandomization achieves optimal fairness preservation but with weaker output approximation. We then devise a derandomization procedure that provides an appealing tradeoff between these two: if is -metric fair according to a metric with a locality-sensitive hash (LSH) family, then our derandomized is, with high probability, -metric fair and a close approximation of . We also prove generic results applicable to all (fair and unfair) classifier derandomization procedures, including a bias-variance decomposition and reductions between various notions of metric fairness.
Chat is not available.