Linearized Cluster Assignment via Spectral Ordering
Chris Ding - Lawrence Berkeley National Laboratory
Xiaofeng He - Lawrence Berkeley National Laboratory
Spectral clustering uses eigenvectors of the Laplacian of the similarity matrix. They are most conveniently applied to 2-way clustering problems. When applying to multi-way clustering, either the 2-way spectral clustering is recursively applied or an embedding to spectral space is done and some other methods are used to cluster the points. Here we propose and study a K-way cluster assignment method. The method transforms the problem to find valleys and peaks of a 1-D quantity called cluster crossing, which measures the symmetric cluster overlap across a cut point along a linear ordering of the data points. The method can either determine K clusters in one shot or recursively split a current cluster into several smaller ones. We show that a linear ordering based on a distance sensitive objective has a continuous solution which is the eigenvector of the Laplacian, showing the close relationship between clustering and ordering. The method relies on the connectivity matrix constructed as the truncated spectral expansion of the similarity matrix, useful for revealing cluster structure. The method is applied to newsgroups to illustrate introduced concepts; experiments show it outperforms the recursive 2-way clustering and the standard K-means clustering.