

Poster

Let Go of Your Labels with Unsupervised Transfer Learning

Artyom Gadetsky · Yulun Jiang · Maria Brbic


Abstract:

Foundation vision-language models have enabled remarkable zero-shot transferability of the pre-trained representations to a wide range of downstream tasks. However, zero-shot transfer still necessitates human guidance to define the visual categories that appear in the data. Here, we show that fully unsupervised transfer emerges when searching for the labeling of a dataset that induces maximal margin classifiers in the representation spaces of different foundation models. We present TURTLE, a fully unsupervised method that effectively employs this guiding principle to uncover the underlying labeling of a downstream dataset without any supervision or task-specific representation learning. We evaluate TURTLE on a diverse benchmark suite of 26 datasets and show that it outperforms zero-shot transfer baselines on a wide range of them. In particular, TURTLE matches the average performance of CLIP zero-shot across the 26 datasets by employing the same representation space, spanning a wide range of architectures and model sizes. Remarkably, guiding the search for the underlying labeling using the representation spaces of two foundation models surpasses zero-shot transfer, demonstrating the surprising power and effectiveness of unsupervised transfer learning.
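The guiding principle stated in the abstract, searching for a labeling that induces maximal margin classifiers in the representation spaces of several foundation models, can be illustrated concretely. Below is a minimal, hypothetical PyTorch sketch, not the authors' implementation: the bilevel structure (fit linear classifiers to the current labels in each space, then update the labels so those classifiers separate the data confidently), the function name search_labeling, and all hyperparameters are illustrative assumptions. Cross-entropy stands in here for an explicit margin objective, and an entropy term prevents collapse to a single cluster.

```python
import torch
import torch.nn.functional as F


def search_labeling(feats_a, feats_b, n_classes=10, steps=500,
                    inner_steps=10, lr=1e-2):
    """Hypothetical sketch: search for a labeling of a dataset that is
    linearly separable in both representation spaces. feats_a: (N, d_a)
    and feats_b: (N, d_b) are frozen features of the same N images
    extracted from two foundation models."""
    n = feats_a.shape[0]
    # One row of logits per sample: a soft labeling of the whole dataset,
    # optimized directly by the outer loop.
    label_logits = torch.zeros(n, n_classes, requires_grad=True)
    outer_opt = torch.optim.Adam([label_logits], lr=lr)
    for _ in range(steps):
        y = F.softmax(label_logits, dim=1)  # current soft labels
        outer_loss = 0.0
        for feats in (feats_a, feats_b):
            # Inner problem: fit a linear classifier to the current labels
            # in this representation space (labels detached so the fit
            # itself is not differentiated through).
            w = torch.zeros(feats.shape[1], n_classes, requires_grad=True)
            inner_opt = torch.optim.SGD([w], lr=0.1)
            for _ in range(inner_steps):
                inner_opt.zero_grad()
                F.cross_entropy(feats @ w, y.detach()).backward()
                inner_opt.step()
            # Outer objective: the labels should be predicted confidently
            # by the fitted classifier (a proxy for a large margin).
            outer_loss = outer_loss + F.cross_entropy(feats @ w.detach(), y)
        # Maximize the entropy of the average label distribution to keep
        # clusters balanced and rule out the one-cluster degenerate solution.
        mean_y = y.mean(dim=0)
        outer_loss = outer_loss - (-(mean_y * mean_y.log()).sum())
        outer_opt.zero_grad()
        outer_loss.backward()
        outer_opt.step()
    return label_logits.argmax(dim=1)  # hard cluster assignments
```

In this sketch, feats_a and feats_b would come from, for example, two frozen image encoders; in practice one would process the data in batches, and a hinge-style loss could replace cross-entropy to optimize the margin more literally.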
