Abstract:
We present algorithms that create coresets for clustering problems according to a wide subset of Bregman divergences. Our coresets have a small additive error, similar in magnitude to the lightweight coresets~\cite{bachem2018scalable}, and take an update time $O(d)$. We first give an algorithm that returns {\em online} coresets of size $\tilde{O}((dk\log k)/\epsilon^2\mu^2)$ for $k$-clustering according to any $\mu$-similar Bregman divergence. We also show the existence of {\em non-parametric} coresets, i.e. the coreset size is independent of the number of clusters $k$, as well the dimension of points $d$, for the same subclass of Bregman divergences.