
Assessing Generalization of SGD via Disagreement Rates
YiDing Jiang · Vaishnavh Nagarajan · Zico Kolter

We empirically show that the test error of deep networks can be estimated simply by training the same architecture on the same training set with a different run of SGD, and measuring the disagreement rate between the two networks on unlabeled test data. This builds on, and strengthens, the observation of Nakkiran & Bansal (2020), which requires the second run to use an altogether fresh training set. We further show theoretically that this peculiar phenomenon arises from the well-calibrated nature of ensembles of SGD-trained models. This finding not only provides a simple empirical measure for predicting test error directly from unlabeled test data, but also establishes a new conceptual connection between generalization and calibration.
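The estimator described above can be sketched in a few lines. The snippet below is a minimal toy illustration, not the paper's experimental setup: it uses a hypothetical synthetic dataset and a hand-rolled SGD logistic-regression trainer in place of deep networks, with only the random seed (initialization and minibatch order) differing between the two runs.

```python
import numpy as np

def sgd_logreg(X, y, seed, epochs=20, lr=0.1):
    """Train logistic regression with plain SGD; the seed controls both
    the random initialization and the per-epoch shuffling order."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))   # sigmoid prediction
            w -= lr * (p - y[i]) * X[i]           # gradient step on one example
    return w

# Hypothetical synthetic data standing in for the paper's benchmarks.
rng = np.random.default_rng(0)
w_true = rng.normal(size=20)
X = rng.normal(size=(700, 20))
y = (X @ w_true > 0).astype(float)
X_tr, y_tr, X_te = X[:500], y[:500], X[500:]      # test labels are never used

# Two SGD runs on the SAME training set, differing only in seed.
w1 = sgd_logreg(X_tr, y_tr, seed=1)
w2 = sgd_logreg(X_tr, y_tr, seed=2)

# Disagreement rate on unlabeled test inputs: the proposed test-error proxy.
pred1, pred2 = (X_te @ w1 > 0), (X_te @ w2 > 0)
disagreement = float(np.mean(pred1 != pred2))
print(f"disagreement rate (test-error proxy): {disagreement:.3f}")
```

In the paper's setting the two models would be deep networks and the disagreement rate would be compared against the true test error; here the sketch only shows the mechanics of the measurement itself.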

Author Information

YiDing Jiang (Carnegie Mellon University)
Vaishnavh Nagarajan (Carnegie Mellon University)

I work in machine learning and deep learning theory.

Zico Kolter (Carnegie Mellon University / Bosch Center for AI)
