Coresets for Ordered Weighted Clustering

Vladimir Braverman · Shaofeng Jiang · Robert Krauthgamer · Xuan Wu

Pacific Ballroom #191

Keywords: [ Unsupervised Learning ] [ Fairness ] [ Clustering ]

Abstract: We design coresets for Ordered k-Median, a generalization of classical clustering problems such as k-Median and k-Center. Its objective function is defined via the Ordered Weighted Averaging (OWA) paradigm of Yager (1988), in which data points are weighted according to a predefined weight vector, with the weights assigned in order of the points' contribution to the objective (their distance from the centers). A coreset is a powerful data-reduction technique that summarizes a point set $X$ in $\mathbb{R}^d$ by a small (weighted) point set $X'$, such that for every set of $k$ potential centers, the objective value of the coreset $X'$ approximates that of $X$ within a factor of $1\pm \epsilon$. When there are multiple objectives (weight vectors), such a standard coreset might have limited usefulness, whereas in a simultaneous coreset the above approximation holds for all weight vectors (in addition to all centers). Our main result is a construction of a simultaneous coreset of size $O_{\epsilon, d}(k^2 \log^2 |X|)$ for Ordered k-Median. We validate our algorithm on a real geographical data set, and we find that our coreset leads to a massive speedup of clustering computations while maintaining high accuracy for a range of weights.
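To make the objective in the abstract concrete, the following is a minimal sketch of how an Ordered k-Median cost could be evaluated on an unweighted point set. It is not the authors' code and does not implement their coreset construction. The function name `ordered_k_median_cost`, the toy points, and the convention that distances are sorted from largest to smallest before being paired with the weight vector are all assumptions made for illustration.

```python
import numpy as np

def ordered_k_median_cost(points, centers, weights):
    """Illustrative (hypothetical) Ordered k-Median cost.

    Each point's distance to its nearest center is computed, the distances
    are sorted from largest to smallest, and the sorted vector is combined
    with `weights` by a dot product.
    """
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Pairwise Euclidean distances, shape (n_points, n_centers).
    diffs = points[:, None, :] - centers[None, :, :]
    nearest = np.linalg.norm(diffs, axis=2).min(axis=1)
    # Sort in non-increasing order so weights[0] multiplies the farthest point.
    ordered = np.sort(nearest)[::-1]
    return float(ordered @ weights)

# Toy data (assumed for illustration only): five points on a line, two centers.
points = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [5.0, 0.0], [10.0, 0.0]])
centers = np.array([[0.0, 0.0], [4.0, 0.0]])
n = len(points)

# All-ones weights give the k-Median objective (sum of distances).
k_median = ordered_k_median_cost(points, centers, np.ones(n))
# Weights (1, 0, ..., 0) give the k-Center objective (maximum distance).
k_center = ordered_k_median_cost(points, centers, np.eye(n)[0])
# Weights (1, 1, 0, ..., 0) sum only the two largest distances.
top_two = ordered_k_median_cost(points, centers, np.array([1.0, 1.0, 0.0, 0.0, 0.0]))

print(k_median, k_center, top_two)
```

Changing only the weight vector switches between k-Median, k-Center, and intermediate objectives on the same points and centers, which is the setting in which a simultaneous coreset (one summary valid for all weight vectors) is useful.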
