Skip to yearly menu bar Skip to main content


Poster

From ImageNet to Image Classification: Contextualizing Progress on Benchmarks

Dimitris Tsipras · Shibani Santurkar · Logan Engstrom · Andrew Ilyas · Aleksander Madry

Keywords: [ Deep Learning - General ] [ Other ] [ Crowdsourcing ] [ Computer Vision ]


Abstract:

Building rich machine learning datasets in a scalable manner often necessitates a crowd-sourced data collection pipeline. In this work, we use human studies to investigate the consequences of employing such a pipeline, focusing on the popular ImageNet dataset. We study how specific design choices in the ImageNet creation process impact the fidelity of the resulting dataset---including the introduction of biases that state-of-the-art models exploit. Our analysis pinpoints how a noisy data collection pipeline can lead to a systematic misalignment between the resulting benchmark and the real-world task it serves as a proxy for. Finally, our findings emphasize the need to augment our current model training and evaluation toolkit to take such misalignment into account.

Chat is not available.