
Don’t Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification
Yu Bai · Song Mei · Huan Wang · Caiming Xiong

Wed Jul 21 05:35 AM -- 05:40 AM (PDT)

Modern machine learning models with high accuracy are often miscalibrated: the predicted top probability does not reflect the actual accuracy, and tends to be over-confident. It is commonly believed that such over-confidence is mainly due to over-parametrization, in particular when the model is large enough to memorize the training data and maximize the confidence.

In this paper, we show theoretically that over-parametrization is not the only reason for over-confidence. We prove that logistic regression is inherently over-confident in the realizable, under-parametrized setting, where the data is generated from the logistic model and the sample size is much larger than the number of parameters. Further, this over-confidence happens for general well-specified binary classification problems as long as the activation is symmetric and concave on the positive part. Perhaps surprisingly, we also show that over-confidence is not always the case: there exists another activation function (and a suitable loss function) under which the learned classifier is under-confident at some probability values. Overall, our theory provides a precise characterization of calibration in realizable binary classification, which we verify in simulations and real-data experiments.
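
The setting described in the abstract can be explored with a small simulation. The sketch below is an illustration only and is not the authors' experimental code: it draws Gaussian features, generates labels from a logistic model, fits unregularized logistic regression by Newton's method, and compares the average predicted top probability with the accuracy implied by the true model. The function names, the Gaussian feature distribution, and the values of `n`, `d`, and `signal` are all assumptions chosen for illustration. Note also that the paper's theory concerns calibration at each confidence level, whereas this sketch only checks the coarser average gap.

```python
import numpy as np


def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))


def fit_logistic_mle(X, y, n_iter=25):
    """Unregularized logistic regression fitted with plain Newton steps."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / n
        curvature = p * (1.0 - p)
        hess = (X * curvature[:, None]).T @ X / n
        w -= np.linalg.solve(hess, grad)
    return w


def confidence_vs_accuracy(n=1000, d=100, signal=2.0, n_test=20000, seed=0):
    """Return (mean confidence, mean accuracy) of the fitted classifier.

    Assumed setup (not taken from the paper): x ~ N(0, I_d), labels drawn
    from a logistic model with true weights of norm `signal`, and n = 10 * d
    so the problem is under-parametrized.
    """
    rng = np.random.default_rng(seed)
    w_star = np.zeros(d)
    w_star[0] = signal

    # Training data generated from the logistic model (realizable setting).
    X = rng.standard_normal((n, d))
    y = (rng.random(n) < sigmoid(X @ w_star)).astype(float)
    w_hat = fit_logistic_mle(X, y)

    # Fresh test features; accuracy uses the true conditional probabilities
    # rather than sampled labels, which removes label noise from the estimate.
    X_test = rng.standard_normal((n_test, d))
    p_hat = sigmoid(X_test @ w_hat)
    p_true = sigmoid(X_test @ w_star)

    confidence = np.maximum(p_hat, 1.0 - p_hat).mean()
    accuracy = np.where(p_hat >= 0.5, p_true, 1.0 - p_true).mean()
    return confidence, accuracy


if __name__ == "__main__":
    conf, acc = confidence_vs_accuracy()
    print(f"mean confidence: {conf:.3f}, mean accuracy: {acc:.3f}")
```

If the claim in the abstract holds in this toy setup, the reported mean confidence should exceed the mean accuracy, and the gap should shrink as `n` grows relative to `d`.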

Author Information

Yu Bai (Salesforce Research)
Song Mei (UC Berkeley)
Huan Wang (Salesforce Research)
Caiming Xiong (Salesforce)
