Spotlight
Sharper Generalization Bounds for Clustering
Shaojie Li · Yong Liu
Abstract:
Existing generalization analysis of clustering mainly focuses on specific instantiations, such as (kernel) -means, and a unified framework for studying clustering performance is still lacking. Besides, the existing excess clustering risk bounds are mostly of order provided that the underlying distribution has bounded support, where is the sample size and is the cluster numbers, or of order under strong assumptions on the underlying distribution, where these assumptions are hard to be verified in general. In this paper, we propose a unified clustering learning framework and investigate its excess risk bounds, obtaining state-of-the-art upper bounds under mild assumptions. Specifically, we derive sharper bounds of order under mild assumptions on the covering number of the hypothesis spaces, where these assumptions are easy to be verified. Moreover, for the hard clustering scheme, such as (kernel) -means, if just assume the hypothesis functions to be bounded, we improve the upper bounds from the order to . Furthermore, state-of-the-art bounds of faster order are obtained with the covering number assumptions.
Chat is not available.