Workshop: The First Workshop on Pre-training: Perspectives, Pitfalls, and Paths Forward

How robust are pre-trained models to distribution shift?

Yuge Shi · Imant Daunhawer · Julia Vogt · Phil Torr · Amartya Sanyal


The vulnerability of machine learning models to spurious correlations has mostly been discussed in the context of supervised learning (SL). However, there is little insight into how spurious correlations affect the performance of popular self-supervised learning (SSL) and auto-encoder-based (AE) models. In this work, we shed light on this question by evaluating the performance of these models on both real-world and synthetic distribution shift datasets. Following observations that the linear head itself can be susceptible to spurious correlations, we develop a new evaluation scheme in which the linear head is trained on out-of-distribution (OOD) data, isolating the performance of the pre-trained model from any bias of the linear head used for evaluation. With this new methodology, we show that SSL models are consistently more robust to distribution shifts, and thus better at OOD generalisation, than AE and SL models.
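
As a rough illustration of the evaluation idea, the sketch below trains two linear heads on top of the same frozen features, one on in-distribution (ID) data and one on OOD data, and scores both on held-out OOD data. It is not the authors' code: the synthetic two-feature dataset, the identity `frozen_encoder`, and every function name are assumptions made for this example only.

```python
import numpy as np

def make_data(n, spurious_corr, rng):
    """Hypothetical toy data: a noisy core feature plus a spurious feature
    that agrees with the label with probability `spurious_corr`."""
    y = rng.integers(0, 2, size=n)
    s = 2 * y - 1  # labels mapped to {-1, +1}
    core = s + rng.normal(0.0, 1.0, size=n)
    agree = rng.random(n) < spurious_corr
    spurious = np.where(agree, s, -s) + rng.normal(0.0, 0.1, size=n)
    return np.stack([core, spurious], axis=1), y

def frozen_encoder(x):
    """Stand-in for a frozen pre-trained backbone (identity, by assumption)."""
    return x

def fit_linear_head(z, y, steps=500, lr=0.5):
    """Logistic-regression head fitted with full-batch gradient descent."""
    w = np.zeros(z.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(z @ w + b)))
        grad = p - y
        w -= lr * (z.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def accuracy(z, y, w, b):
    return float(((z @ w + b > 0).astype(int) == y).mean())

rng = np.random.default_rng(0)
x_id, y_id = make_data(4000, 0.95, rng)              # spurious feature tracks the label
x_ood_train, y_ood_train = make_data(4000, 0.5, rng) # spurious feature is uninformative
x_ood_test, y_ood_test = make_data(4000, 0.5, rng)

# Same frozen features, two different linear heads.
head_id = fit_linear_head(frozen_encoder(x_id), y_id)
head_ood = fit_linear_head(frozen_encoder(x_ood_train), y_ood_train)

z_test = frozen_encoder(x_ood_test)
acc_id_head = accuracy(z_test, y_ood_test, *head_id)
acc_ood_head = accuracy(z_test, y_ood_test, *head_ood)
print(f"OOD accuracy, head trained on ID data:  {acc_id_head:.3f}")
print(f"OOD accuracy, head trained on OOD data: {acc_ood_head:.3f}")
```

In this toy setting the features are identical for both heads, so any gap in OOD accuracy comes from the head alone. Training the head on OOD data removes that source of bias, which leaves differences between encoders as the quantity being compared.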
