Dual-stage Contrastive Learning-enhanced Multi-view Variational Clustering
Abstract
Multi-view clustering aims to obtain a consensus clustering by integrating complementary and consistent information from multiple views. However, two critical challenges remain in variational methods: (1) view heterogeneity and noise often make fusion unreliable; (2) ambiguous posteriors and misassigned boundary samples degrade clustering performance. To address these issues, we propose Dual-stage Contrastive Learning-enhanced Multi-view Variational Clustering (DCL-MVC), which integrates contrastive learning into both the fusion and representation stages. First, at the fusion stage, we introduce a fusion-then-attention mechanism that captures cross-view interactions and learns view-level attention weights to build a unified and reliable fused representation, and we further apply instance-level contrastive learning to enforce cross-view alignment. Second, at the representation stage, we focus on boundary samples with uncertain posteriors and refine their cluster assignments: a cluster-center contrastive loss enlarges inter-cluster margins, while prototypical contrastive learning with a confidence-aware curriculum promotes intra-cluster compactness. Extensive experiments on six real-world datasets demonstrate consistent improvements over strong baselines and validate the contribution of each component.
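The instance-level contrastive learning mentioned above can be sketched with a standard symmetric InfoNCE objective between two views' embeddings of the same samples. This is a minimal illustrative sketch in NumPy, not the paper's exact formulation: the function name `info_nce_cross_view` and the temperature value are our own assumptions.

```python
import numpy as np

def info_nce_cross_view(z1, z2, temperature=0.5):
    """Symmetric InfoNCE loss aligning two views at the instance level.

    z1, z2: (n, d) embeddings of the same n samples from two views.
    Row i of z1 and row i of z2 form a positive pair; every other
    cross-view row serves as a negative. (Illustrative sketch only.)
    """
    # L2-normalize so dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temperature  # (n, n) cross-view similarities

    def nll_diag(s):
        # Numerically stable row-wise log-softmax; the diagonal entries
        # are the log-probabilities of the positive pairs.
        m = s.max(axis=1, keepdims=True)
        log_softmax = s - m - np.log(np.exp(s - m).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_softmax))

    # Symmetrize over the two view directions (view1->view2 and back).
    return 0.5 * (nll_diag(sim) + nll_diag(sim.T))
```

With well-aligned views the loss approaches zero; when the cross-view correspondence is shuffled, the positives are no longer on the diagonal and the loss rises toward log n, which is what makes it a useful alignment signal at the fusion stage.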