Topology-Aware Contrastive Learning: Regulating Representation Connectivity via Persistent Homology
Abstract
Standard contrastive learning minimizes the geometric distance between positive pairs, implicitly assuming that strict compactness optimizes discrimination. However, this topology-agnostic objective neglects intrinsic data structure and topological complexity, leading to class confusion, particularly when aggressive augmentations induce semantic drift. To address this, we propose Topology-Aware Contrastive Learning, a framework that shifts the objective from geometric singularity to topological connectivity. Leveraging Persistent Homology, we explicitly regulate the connectivity of the latent space, ensuring that positive pairs maintain an α–β connectivity regime that balances intra-class cohesion with separability. Theoretically, we formally define the topology-agnostic confusion phenomenon, prove that excessive compactness strictly lower-bounds the probability of confusion, and derive a generalization bound demonstrating that richer topological connectivity tightens downstream risk. Furthermore, we establish a measure-theoretic framework to mitigate the sensitivity of our method to varying augmentation strengths. Empirical results on standard benchmarks confirm that our approach enhances representation quality and reduces reliance on specific augmentation strategies compared to standard baselines. Our code will be made publicly available upon acceptance.
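To make the connectivity regulation concrete, the following is a minimal sketch, not the paper's released implementation, of a persistent-homology connectivity regularizer. It uses the standard fact that the death times of 0-dimensional features in a Vietoris–Rips filtration of a point cloud equal the edge lengths of its Euclidean minimum spanning tree, and penalizes death times that fall outside a band [α, β]; the band, the quadratic penalty, and the names `h0_death_times` and `connectivity_loss` are illustrative assumptions, not the authors' definitions.

```python
# Sketch of a topology-aware connectivity regularizer (assumptions noted above).
# H0 death times of a Vietoris-Rips filtration = MST edge lengths, computed
# here with Kruskal's algorithm over pairwise distances so gradients flow
# back through the selected edges to the embeddings.
import torch


def h0_death_times(z: torch.Tensor) -> torch.Tensor:
    """0-dim persistence death times of a point cloud (= MST edge lengths)."""
    n = z.shape[0]
    dist = torch.cdist(z, z)                      # pairwise Euclidean distances
    iu = torch.triu_indices(n, n, offset=1)       # upper-triangular edge list
    weights = dist[iu[0], iu[1]]
    order = torch.argsort(weights)                # Kruskal: edges by length

    parent = list(range(n))                       # union-find over points

    def find(a: int) -> int:
        while parent[a] != a:
            parent[a] = parent[parent[a]]         # path halving
            a = parent[a]
        return a

    deaths = []
    for k in order:
        a, b = find(int(iu[0, k])), find(int(iu[1, k]))
        if a != b:                                # edge merges two components
            parent[a] = b
            deaths.append(weights[k])             # differentiable w.r.t. z
        if len(deaths) == n - 1:                  # MST complete
            break
    return torch.stack(deaths)


def connectivity_loss(z: torch.Tensor, alpha: float = 0.1, beta: float = 1.0):
    """Penalize H0 death times outside the hypothetical band [alpha, beta]."""
    d = h0_death_times(z)
    too_tight = torch.relu(alpha - d)             # over-compact: merged too early
    too_loose = torch.relu(d - beta)              # fragmented: merged too late
    return (too_tight ** 2 + too_loose ** 2).mean()


if __name__ == "__main__":
    z = torch.randn(32, 128, requires_grad=True)  # a batch of embeddings
    loss = connectivity_loss(z)
    loss.backward()                               # gradients reach the embeddings
    print(loss.item())
```

In practice such a term would be added to the contrastive objective with a weighting coefficient; the key design point illustrated here is that the penalty keeps components connected within a tolerance band rather than collapsing positive pairs to a single point.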