Uniform Deviation Bounds for k-Means Clustering
Olivier Bachem · Mario Lucic · Hamed Hassani · Andreas Krause

Tue Aug 8th 01:30 -- 01:48 PM @ C4.8

Uniform deviation bounds limit the difference between a model's expected loss and its loss on an empirical sample uniformly for all models in a learning problem. In this paper, we provide a novel framework to obtain uniform deviation bounds for loss functions which are unbounded. As a result, we obtain competitive uniform deviation bounds for k-Means clustering under weak assumptions on the underlying distribution. If the fourth moment is bounded, we prove a rate of O(m^(-1/2)) compared to the previously known O(m^(-1/4)) rate. Furthermore, we show that the rate also depends on the kurtosis - the normalized fourth moment which measures the "tailedness" of a distribution. We also provide improved rates under progressively stronger assumptions, namely, bounded higher moments, subgaussianity and bounded support of the underlying distribution.

Author Information

Olivier Bachem (ETH Zurich)
Mario Lucic (ETH Zurich)
Hamed Hassani (ETH Zurich)
Andreas Krause (ETH Zurich)

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors