Generalization is the main goal in machine learning, but few researchers systematically investigate how well models perform on truly unseen data. This raises the danger that the community may be overfitting to excessively re-used test sets. To investigate this question, we conduct a novel reproducibility experiment on CIFAR-10 and ImageNet by assembling new test sets and then evaluating a wide range of classification models. Despite our careful efforts to match the distribution of the original datasets, the accuracy of many models drops by around 10%. However, accuracy gains on the original test sets translate to larger gains on the new test sets. Our results show that the accuracy drops are likely not caused by adaptive overfitting, but by the models' inability to generalize reliably to slightly "harder" images than those found in the original test sets.
Author Information
Benjamin Recht (Berkeley)
Benjamin Recht is an Associate Professor in the Department of Electrical Engineering and Computer Sciences at the University of California, Berkeley. Ben's research group studies the theory and practice of optimization algorithms with a focus on applications in machine learning, data analysis, and controls. Ben is the recipient of a Presidential Early Career Award for Scientists and Engineers, an Alfred P. Sloan Research Fellowship, the 2012 SIAM/MOS Lagrange Prize in Continuous Optimization, the 2014 Jamon Prize, the 2015 William O. Baker Award for Initiatives in Research, and the 2017 NIPS Test of Time Award.
Becca Roelofs (University of California Berkeley)
Ludwig Schmidt (University of California, Berkeley)
Vaishaal Shankar (UC Berkeley)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Poster: Do ImageNet Classifiers Generalize to ImageNet?
  Thu Jun 13th 01:30 -- 04:00 AM, Room Pacific Ballroom
More from the Same Authors
- 2020 Poster: Neural Kernels Without Tangents
  Vaishaal Shankar · Alex Fang · Wenshuo Guo · Sara Fridovich-Keil · Jonathan Ragan-Kelley · Ludwig Schmidt · Benjamin Recht
- 2020 Poster: Evaluating Machine Accuracy on ImageNet
  Vaishaal Shankar · Rebecca Roelofs · Horia Mania · Alex Fang · Benjamin Recht · Ludwig Schmidt
- 2020 Poster: The Effect of Natural Distribution Shift on Question Answering Models
  John Miller · Karl Krauth · Benjamin Recht · Ludwig Schmidt
- 2019 Workshop: Identifying and Understanding Deep Learning Phenomena
  Hanie Sedghi · Samy Bengio · Kenji Hata · Aleksander Madry · Ari Morcos · Behnam Neyshabur · Maithra Raghu · Ali Rahimi · Ludwig Schmidt · Ying Xiao
- 2018 Poster: Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator
  Stephen Tu · Benjamin Recht
- 2018 Oral: Least-Squares Temporal Difference Learning for the Linear Quadratic Regulator
  Stephen Tu · Benjamin Recht
- 2018 Tutorial: Optimization Perspectives on Learning to Control
  Benjamin Recht
- 2017 Poster: Breaking Locality Accelerates Block Gauss-Seidel
  Stephen Tu · Shivaram Venkataraman · Ashia Wilson · Alex Gittens · Michael Jordan · Benjamin Recht
- 2017 Talk: Breaking Locality Accelerates Block Gauss-Seidel
  Stephen Tu · Shivaram Venkataraman · Ashia Wilson · Alex Gittens · Michael Jordan · Benjamin Recht