Workshop: The First Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward

Contrastive Learning Can Find An Optimal Basis For Approximately Invariant Functions

Daniel D. Johnson · Daniel D. Johnson · Ayoub El Hanchi · Ayoub El Hanchi · Chris Maddison · Chris Maddison


Contrastive learning is a powerful framework for learning self-supervised representations that generalize well to downstream supervised tasks. We show that multiple existing contrastive learning methods can be reinterpeted as learning a positive-definite kernel that approximates a particular contrastive kernel defined by the positive pairs. The principal components of the data under this kernel exactly correspond to the eigenfunctions of a positive-pair Markov chain, and these eigenfunctions can be used to build a representation thatprovably minimizes the worst-case approximation error of linear predictors under the assumption that positive pairs have similar labels. We give generalization bounds for downstream linear prediction using this optimal representation, and show how to approximate this representation using kernel PCA. We also explore kernel-based representations on a noisy MNIST task for which the positive pair distribution has a closed form, and compare the properties of the true eigenfunctions with their learned approximations.

Chat is not available.