Oral

Using Pre-Training Can Improve Model Robustness and Uncertainty

Dan Hendrycks ⋅ Kimin Lee ⋅ Mantas Mazeika

2019 Oral

[ Slides] [ Video]

Abstract

Tuning a pre-trained network is commonly thought to improve data efficiency. However, Kaiming He et al. (2018) have called into question the utility of pre-training by showing that training from scratch can often yield similar performance, should the model train long enough. We show that although pre-training may not improve performance on traditional classification metrics, it does provide large benefits to model robustness and uncertainty. Through extensive experiments on label corruption, class imbalance, adversarial examples, out-of-distribution detection, and confidence calibration, we demonstrate large gains from pre-training and complementary effects with task-specific methods. Results include a 30% relative improvement in label noise robustness and a 10% absolute improvement in adversarial robustness on both CIFAR-10 and CIFAR-100. In some cases, using pre-training without task-specific methods surpasses the state-of-the-art, highlighting the importance of using pre-training when evaluating future methods on robustness and uncertainty tasks.

Chat is not available.