
Extending the WILDS Benchmark for Unsupervised Adaptation
Shiori Sagawa

Sat Jul 23 06:50 AM -- 07:10 AM (PDT)

Machine learning models deployed in the real world constantly face distribution shifts, which can significantly degrade model performance. In this talk, I will present the WILDS benchmark of real-world distribution shifts, focusing on the version 2.0 update that adds curated unlabeled data. Unlabeled data can be a powerful source of leverage for improving out-of-distribution performance, but existing distribution shift benchmarks with unlabeled data do not reflect the breadth of scenarios that arise in real-world applications. To this end, we add unlabeled data to 8 of the 10 datasets in WILDS, spanning diverse applications and modalities. We observe that existing methods fail to improve out-of-distribution performance on WILDS, even though these methods have been successful on existing benchmarks with different types of distribution shifts. This underscores the importance of developing and evaluating methods on diverse types of distribution shifts, including shifts that arise in practice.

Author Information

Shiori Sagawa (Stanford University)
