Oral in Workshop: Shift happens: Crowdsourcing metrics and test datasets beyond ImageNet

Contributed Talk 2: MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts

Weixin Liang · Xinyu Yang · James Zou


Abstract:

Understanding the performance of machine learning models across diverse data distributions is critically important for reliable applications. Motivated by this, there is a growing focus on curating benchmark datasets that capture distribution shifts. In this work, we present MetaShift, a collection of 12,868 sets of natural images across 410 classes, to address this challenge. We leverage the natural heterogeneity of Visual Genome and its annotations to construct MetaShift. The key construction idea is to cluster images using their metadata, which provides context for each image (e.g., cats with cars or cats in bathroom); each such context represents a distinct data distribution. MetaShift has two important benefits. First, it contains orders of magnitude more natural data shifts than were previously available. Second, it provides explicit explanations of what is unique about each of its data sets, along with a distance score that measures the amount of distribution shift between any two of its data sets. Importantly, to support evaluating ImageNet-trained models on MetaShift, we match MetaShift with the ImageNet hierarchy. The matched version covers 867 out of 1,000 classes in ImageNet-1k. On average, each class in the ImageNet-matched MetaShift contains 19.3 subsets capturing images in different contexts.
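
The construction idea of grouping images by their metadata context can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `build_subsets`, the `min_size` threshold, the `class(context)` naming, and the toy annotations are all assumptions made for illustration, and real Visual Genome annotations are far richer than a set of labels per image.

```python
from collections import defaultdict


def build_subsets(annotations, target_class, min_size=2):
    """Group images of `target_class` by each co-occurring context label.

    `annotations` maps an image id to the set of labels attached to that image.
    Returns a dict mapping a subset name such as "cat(car)" to a set of image ids.
    Subsets with fewer than `min_size` images are dropped (hypothetical threshold).
    """
    subsets = defaultdict(set)
    for image_id, labels in annotations.items():
        if target_class not in labels:
            continue
        # Every other label on the image is treated as one context for the class.
        for context in labels - {target_class}:
            subsets[f"{target_class}({context})"].add(image_id)
    return {name: ids for name, ids in subsets.items() if len(ids) >= min_size}


# Toy, made-up annotations (not taken from Visual Genome).
annotations = {
    "img1": {"cat", "car"},
    "img2": {"cat", "car", "street"},
    "img3": {"cat", "bathroom"},
    "img4": {"cat", "bathroom", "sink"},
    "img5": {"dog", "car"},
}

subsets = build_subsets(annotations, "cat")
for name, image_ids in sorted(subsets.items()):
    print(name, sorted(image_ids))
```

In this toy setting, an image can belong to several subsets of the same class, so subsets may overlap; the sketch makes no attempt to compute the distance score mentioned in the abstract.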
