Skip to yearly menu bar Skip to main content


Scalable Fair Clustering

Arturs Backurs · Piotr Indyk · Krzysztof Onak · Baruch Schieber · Ali Vakilian · Tal Wagner

Pacific Ballroom #84

Keywords: [ Large Scale Learning and Big Data ] [ Fairness ] [ Combinatorial Optimization ]

Abstract: We study the fair variant of the classic k-median problem introduced by (Chierichetti et al., NeurIPS 2017) in which the points are colored, and the goal is to minimize the same average distance objective as in the standard $k$-median problem while ensuring that all clusters have an ``approximately equal'' number of points of each color. (Chierichetti et al., NeurIPS 2017) proposed a two-phase algorithm for fair $k$-clustering. In the first step, the pointset is partitioned into subsets called fairlets that satisfy the fairness requirement and approximately preserve the k-median objective. In the second step, fairlets are merged into k clusters by one of the existing k-median algorithms. The running time of this algorithm is dominated by the first step, which takes super-quadratic time. In this paper, we present a practical approximate fairlet decomposition algorithm that runs in nearly linear time.

Live content is unavailable. Log in and register to view live content