How well do contrastively trained models transfer?
M. Moein Shariatnia · Rahim Entezari · Mitchell Wortsman · Olga Saukh · Ludwig Schmidt
Event URL: https://openreview.net/forum?id=wS7lJi8NZg

There are two prevailing approaches to pre-training on large datasets for learning transferable representations: 1) supervised pre-training on large but weakly labeled datasets; 2) contrastive training on image-only data and on image-text pairs. While supervised pre-training learns representations that transfer well to a wide range of tasks, contrastively trained models such as CLIP have demonstrated unprecedented zero-shot transfer. In this work, we compare the transferability of these two approaches on multiple downstream tasks. The pre-training distributions we consider include YFCC, Conceptual Captions, and ImageNet-21K, while the pre-training objectives range from supervised learning to SimCLR, CLIP, and SLIP. We observe that different pre-training methods trained on the same source transfer similarly given their ImageNet accuracy.
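
As a rough illustration of the kind of transfer evaluation described above, the sketch below measures linear-probe accuracy of a frozen pre-trained image encoder on a downstream task. The `encoder` module, the PyTorch DataLoaders, and the use of scikit-learn's logistic regression are illustrative assumptions, not necessarily the authors' exact protocol.

```python
# Minimal linear-probe transfer sketch (illustrative only).
# Assumes a frozen pre-trained image encoder `encoder` that maps a batch of
# images to feature vectors, plus train/test DataLoaders for a downstream task.
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression

@torch.no_grad()
def extract_features(encoder, loader, device="cuda"):
    """Run the frozen encoder over a DataLoader and collect features and labels."""
    encoder.eval().to(device)
    feats, labels = [], []
    for images, targets in loader:
        f = encoder(images.to(device))      # (batch_size, feature_dim)
        feats.append(f.cpu().numpy())
        labels.append(targets.numpy())
    return np.concatenate(feats), np.concatenate(labels)

def linear_probe_accuracy(encoder, train_loader, test_loader, device="cuda"):
    """Fit a linear classifier on frozen features; return downstream test accuracy."""
    x_train, y_train = extract_features(encoder, train_loader, device)
    x_test, y_test = extract_features(encoder, test_loader, device)
    clf = LogisticRegression(max_iter=1000)
    clf.fit(x_train, y_train)
    return clf.score(x_test, y_test)
```

Repeating this measurement for encoders pre-trained with different objectives (supervised, SimCLR, CLIP, SLIP) and plotting downstream accuracy against ImageNet accuracy is one way to visualize the comparison the abstract describes.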

Author Information

M. Moein Shariatnia (Tehran University of Medical Sciences)
Rahim Entezari (TU Graz)
Mitchell Wortsman (University of Washington)
Olga Saukh (TU Graz)
Ludwig Schmidt (University of Washington)
