On hyperparameter tuning in general clustering problems

Xinjie Fan · Yuguang Yue · Purnamrita Sarkar · Y. X. Rachel Wang

Keywords: [ Networks and Relational Learning ] [ Unsupervised Learning ] [ Other ] [ Unsupervised and Semi-supervised Learning ]


Tuning hyperparameters for unsupervised learning problems is difficult in general due to the lack of ground truth for validation. However, the success of most clustering methods depends heavily on the correct choice of the hyperparameters involved. Take, for example, the Lagrange multipliers of penalty terms in semidefinite programming (SDP) relaxations of community detection in networks, or the bandwidth parameter of the Gaussian kernel used to construct similarity matrices for spectral clustering. Despite the popularity of these clustering algorithms, there are few provable methods for tuning these hyperparameters. In this paper, we provide an overarching framework with provable guarantees for tuning hyperparameters in the above class of problems under two different models. Our framework can also be augmented with a cross-validation procedure to perform model selection. In a variety of simulation and real data experiments, we show that our framework outperforms other widely used tuning procedures in a broad range of parameter settings.
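To make the bandwidth example concrete, the following is a minimal sketch of how a Gaussian-kernel similarity matrix depends on its bandwidth. It is an illustration only and not the tuning procedure proposed in the paper. The function name `gaussian_similarity`, the kernel convention exp(-||x_i - x_j||^2 / (2 * bandwidth^2)), and the synthetic two-cluster data are all assumptions made for this example.

```python
import numpy as np


def gaussian_similarity(X, bandwidth):
    """Pairwise similarities exp(-||x_i - x_j||^2 / (2 * bandwidth^2)).

    Hypothetical helper for illustration; other kernel conventions exist.
    """
    diffs = X[:, None, :] - X[None, :, :]      # shape (n, n, d)
    sq_dists = np.sum(diffs ** 2, axis=-1)     # squared Euclidean distances
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


# Synthetic data (an assumption): two well-separated groups of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(20, 2)),
    rng.normal(4.0, 0.5, size=(20, 2)),
])

for bandwidth in (0.1, 1.0, 10.0):
    K = gaussian_similarity(X, bandwidth)
    within = K[:20, :20].mean()    # average similarity inside the first group
    between = K[:20, 20:].mean()   # average similarity across the two groups
    print(f"bandwidth={bandwidth}: within={within:.3f}, between={between:.3f}")
```

With a very small bandwidth almost all off-diagonal similarities vanish, and with a very large one within-group and between-group similarities become nearly indistinguishable. In both extremes the similarity matrix carries little cluster structure, which is why the choice of bandwidth matters for spectral clustering.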
