Using Pre-Training Can Improve Model Robustness and Uncertainty
Dan Hendrycks · Kimin Lee · Mantas Mazeika

Tue Jun 11th 12:05 -- 12:10 PM @ Grand Ballroom

Tuning a pre-trained network is commonly thought to improve data efficiency. However, He et al. (2018) have called into question the utility of pre-training by showing that training from scratch can often yield similar performance, provided the model trains for long enough. We show that although pre-training may not improve performance on traditional classification metrics, it provides large benefits to model robustness and uncertainty estimation. Through extensive experiments on label corruption, class imbalance, adversarial examples, out-of-distribution detection, and confidence calibration, we demonstrate large gains from pre-training and complementary effects with task-specific methods. Results include a 30% relative improvement in label noise robustness and a 10% absolute improvement in adversarial robustness on both CIFAR-10 and CIFAR-100. In some cases, using pre-training without task-specific methods surpasses the state of the art, highlighting the importance of using pre-training when evaluating future methods on robustness and uncertainty tasks.
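To make the out-of-distribution detection and calibration settings concrete, here is a minimal sketch of the maximum softmax probability (MSP) score, a common confidence-based baseline in this line of work: a classifier's peak softmax probability is used as an in-distribution score, so diffuse (uncertain) predictions flag potential OOD inputs. The logits below are hypothetical examples, not values from the paper.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def msp_score(logits):
    # Maximum softmax probability: higher scores suggest the input
    # is in-distribution; low, diffuse confidence suggests OOD.
    return softmax(logits).max(axis=-1)

# Hypothetical logits: one confident prediction, one diffuse prediction.
in_dist_logits = np.array([[8.0, 1.0, 0.5]])
ood_logits = np.array([[1.2, 1.0, 1.1]])

print(msp_score(in_dist_logits))  # high confidence, near 1
print(msp_score(ood_logits))      # low confidence, near 1/3
```

Thresholding this score yields a simple OOD detector; the paper's experiments measure how much pre-training improves such confidence-based detection and calibration over training from scratch.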

Author Information

Dan Hendrycks (UC Berkeley)
Kimin Lee (KAIST)
Mantas Mazeika (University of Chicago)
