Oral in Workshop: Shift happens: Crowdsourcing metrics and test datasets beyond ImageNet
The Semantic Shift Benchmark
Sagar Vaze · Kai Han · Andrea Vedaldi · Andrew Zisserman
Most benchmarks for detecting semantic distribution shift do not consider how the semantics of the training set are defined. In other words, it is often unclear whether the 'unseen' images contain semantically different objects from the same distribution (e.g. 'birds' for a model trained on 'cats' and 'dogs') or belong to a different distribution entirely (e.g. Gaussian noise for a model trained on 'cats' and 'dogs'). In this work, we propose 'open-set' class splits, drawn from ImageNet-21K, for models trained on ImageNet-1K. Critically, we structure the open-set classes by their semantic similarity to the closed-set classes using the WordNet hierarchy: we create 'Easy' and 'Hard' open-set splits to allow a more principled analysis of the semantic shift phenomenon. Together with similar challenges based on FGVC datasets, these evaluations comprise the 'Semantic Shift Benchmark'.
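As a rough illustration of splitting open-set classes by semantic similarity over a class hierarchy, the sketch below ranks candidate classes by their tree distance to the nearest closed-set class and halves the ranking. It is not the authors' procedure: the toy `PARENT` hierarchy stands in for WordNet, and the function names, the path-length distance, the half-and-half cut, and the convention that 'Hard' means semantically closer to the closed set are all assumptions made for this example.

```python
from typing import Dict, List, Optional

# Toy child -> parent hierarchy standing in for WordNet (hypothetical data).
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "animal": "entity",
    "artifact": "entity",
    "mammal": "animal",
    "bird": "animal",
    "cat": "mammal",
    "dog": "mammal",
    "wolf": "mammal",
    "sparrow": "bird",
    "vehicle": "artifact",
    "truck": "vehicle",
}


def path_to_root(node: str) -> List[str]:
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    current: Optional[str] = node
    while current is not None:
        path.append(current)
        current = PARENT[current]
    return path


def tree_distance(a: str, b: str) -> int:
    """Number of edges between `a` and `b` via their lowest common ancestor."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(a))}
    for steps_from_b, n in enumerate(path_to_root(b)):
        if n in steps_from_a:
            return steps_from_a[n] + steps_from_b
    raise ValueError(f"{a!r} and {b!r} share no ancestor")


def split_open_set(closed: List[str], candidates: List[str]) -> Dict[str, List[str]]:
    """Rank candidates by distance to the nearest closed-set class, then halve.

    Assumption: classes closer to the closed set form the 'hard' split.
    """
    scored = sorted(
        (min(tree_distance(c, k) for k in closed), c) for c in candidates
    )
    ranked = [name for _, name in scored]
    half = len(ranked) // 2
    return {"hard": ranked[:half], "easy": ranked[half:]}


splits = split_open_set(["cat", "dog"], ["truck", "sparrow", "wolf", "bird"])
print(splits)
```

With 'cat' and 'dog' as the closed set, 'wolf' and 'bird' sit nearest in the toy tree and land in the hard split, while 'sparrow' and 'truck' land in the easy one. A real pipeline would replace the toy dictionary with WordNet synsets for the ImageNet-1K and ImageNet-21K classes.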