Oral
Supervised Hierarchical Clustering with Exponential Linkage
Nishant Yadav · Ari Kobren · Nicholas Monath · Andrew McCallum

Thu Jun 13th 12:15 -- 12:20 PM @ Room 103

In supervised clustering, standard techniques for learning a dissimilarity function often present a mismatch between the training and clustering objectives. This mismatch leads to poor performance, which we demonstrate in the case where training maximizes prediction accuracy on all within- and across-cluster pairs and clustering is performed by agglomerative clustering with single linkage. We present a training procedure tailored specifically to single linkage clustering that results in improved performance. Since designing specialized training procedures is cumbersome, we introduce the parametric family of exponential linkage functions, which smoothly interpolates between single, average, and complete linkage, and give a training procedure that jointly selects a linkage from the family and learns a dissimilarity function suited to that linkage. In experiments on four datasets, our training procedure leads to improvements of up to 6% in dendrogram purity over all-pairs training and consistently matches or outperforms the next best linkage/training-procedure pair on three out of four datasets.
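
The abstract does not state the formula for the exponential linkage family. As a minimal sketch, assume the linkage between two clusters is a softmax-weighted average of their pairwise dissimilarities, controlled by a single parameter alpha: alpha = 0 gives average linkage, while large negative and large positive values approach single (minimum) and complete (maximum) linkage. The function name, the parameter name, and the sample dissimilarities below are illustrative assumptions, not taken from the paper.

```python
import math

def exponential_linkage(dists, alpha):
    """Softmax-weighted average of pairwise dissimilarities (assumed form).

    dists: non-empty list of dissimilarities d(u, v), one per cross-cluster pair.
    alpha: interpolation parameter (hypothetical name).
    """
    # Subtract the largest exponent before exponentiating to avoid overflow.
    shift = max(alpha * d for d in dists)
    weights = [math.exp(alpha * d - shift) for d in dists]
    return sum(w * d for w, d in zip(weights, dists)) / sum(weights)

# Illustrative dissimilarities between the points of two small clusters.
dists = [1.0, 2.0, 4.0]

avg = exponential_linkage(dists, 0.0)         # equal weights: average linkage
near_min = exponential_linkage(dists, -50.0)  # weight concentrates on the smallest value
near_max = exponential_linkage(dists, 50.0)   # weight concentrates on the largest value
```

Because the linkage is differentiable in alpha under this assumed form, alpha could in principle be tuned by gradient-based training alongside the dissimilarity function, which is in the spirit of the joint procedure the abstract describes.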

Author Information

Nishant Yadav (University of Massachusetts Amherst)
Ari Kobren (University of Massachusetts Amherst)
Nicholas Monath (University of Massachusetts Amherst)
Andrew McCallum (University of Massachusetts Amherst)
