We evaluate a wide range of ImageNet models with five trained human labelers. In our year-long experiment, trained humans first annotated 40,000 images from the ImageNet and ImageNetV2 validation sets with multi-label annotations to enable a semantically coherent evaluation. Then we measured the classification accuracy of the five trained humans on the full task with 1,000 classes. Only the latest models from 2020 are on par with our best human labeler, and human accuracy on the 590 object classes is still 4% and 11% higher than the best model on ImageNet and ImageNetV2, respectively. Moreover, humans achieve the same accuracy on ImageNet and ImageNetV2, while all models see a consistent accuracy drop. Overall, our results show that there is still substantial room for improvement on ImageNet, and that direct accuracy comparisons between humans and machines may overstate machine performance.
Author Information
Vaishaal Shankar (UC Berkeley)
Rebecca Roelofs (Google)
Horia Mania (UC Berkeley)
Alex Fang (UC Berkeley)
Benjamin Recht (Berkeley)
Benjamin Recht is an Associate Professor in the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley. Ben's research group studies the theory and practice of optimization algorithms with a focus on applications in machine learning, data analysis, and controls. Ben is the recipient of a Presidential Early Career Award for Scientists and Engineers, an Alfred P. Sloan Research Fellowship, the 2012 SIAM/MOS Lagrange Prize in Continuous Optimization, the 2014 Jamon Prize, the 2015 William O. Baker Award for Initiatives in Research, and the 2017 NIPS Test of Time Award.
Ludwig Schmidt (University of California, Berkeley)
More from the Same Authors
-
2023 Poster: Robustness in Multimodal Learning under Train-Test Modality Mismatch »
Brandon McKinzie · Vaishaal Shankar · Joseph Cheng · Yinfei Yang · Jonathon Shlens · Alexander Toshev -
2022 Poster: Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP) »
Alex Fang · Gabriel Ilharco · Mitchell Wortsman · Yuhao Wan · Vaishaal Shankar · Achal Dave · Ludwig Schmidt -
2022 Spotlight: Data Determines Distributional Robustness in Contrastive Language Image Pre-training (CLIP) »
Alex Fang · Gabriel Ilharco · Mitchell Wortsman · Yuhao Wan · Vaishaal Shankar · Achal Dave · Ludwig Schmidt -
2021 Poster: Quantifying Availability and Discovery in Recommender Systems via Stochastic Reachability »
Mihaela Curmei · Sarah Dean · Benjamin Recht -
2021 Spotlight: Quantifying Availability and Discovery in Recommender Systems via Stochastic Reachability »
Mihaela Curmei · Sarah Dean · Benjamin Recht -
2021 Poster: Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization »
John Miller · Rohan Taori · Aditi Raghunathan · Shiori Sagawa · Pang Wei Koh · Vaishaal Shankar · Percy Liang · Yair Carmon · Ludwig Schmidt -
2021 Poster: Representation Matters: Assessing the Importance of Subgroup Allocations in Training Data »
Esther Rolf · Theodora Worledge · Benjamin Recht · Michael Jordan -
2021 Spotlight: Accuracy on the Line: on the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization »
John Miller · Rohan Taori · Aditi Raghunathan · Shiori Sagawa · Pang Wei Koh · Vaishaal Shankar · Percy Liang · Yair Carmon · Ludwig Schmidt -
2021 Spotlight: Representation Matters: Assessing the Importance of Subgroup Allocations in Training Data »
Esther Rolf · Theodora Worledge · Benjamin Recht · Michael Jordan -
2020 Poster: Neural Kernels Without Tangents »
Vaishaal Shankar · Alex Fang · Wenshuo Guo · Sara Fridovich-Keil · Jonathan Ragan-Kelley · Ludwig Schmidt · Benjamin Recht -
2020 Poster: The Effect of Natural Distribution Shift on Question Answering Models »
John Miller · Karl Krauth · Benjamin Recht · Ludwig Schmidt -
2019 Workshop: Identifying and Understanding Deep Learning Phenomena »
Hanie Sedghi · Samy Bengio · Kenji Hata · Aleksander Madry · Ari Morcos · Behnam Neyshabur · Maithra Raghu · Ali Rahimi · Ludwig Schmidt · Ying Xiao -
2019 Poster: Do ImageNet Classifiers Generalize to ImageNet? »
Benjamin Recht · Rebecca Roelofs · Ludwig Schmidt · Vaishaal Shankar -
2019 Oral: Do ImageNet Classifiers Generalize to ImageNet? »
Benjamin Recht · Rebecca Roelofs · Ludwig Schmidt · Vaishaal Shankar -
2018 Poster: Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator »
Stephen Tu · Benjamin Recht -
2018 Oral: Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator »
Stephen Tu · Benjamin Recht -
2018 Tutorial: Optimization Perspectives on Learning to Control »
Benjamin Recht -
2017 Poster: Breaking Locality Accelerates Block Gauss-Seidel »
Stephen Tu · Shivaram Venkataraman · Ashia Wilson · Alex Gittens · Michael Jordan · Benjamin Recht -
2017 Talk: Breaking Locality Accelerates Block Gauss-Seidel »
Stephen Tu · Shivaram Venkataraman · Ashia Wilson · Alex Gittens · Michael Jordan · Benjamin Recht