Precision-Induced Miscalibration: Understanding and Correcting Confidence Distortion in Quantized Neural Networks
Jiawei Gu ⋅ Fengyuan Nie ⋅ Hao Tang ⋅ Yanpeng Sun
Abstract
Low-precision arithmetic is pervasive in neural network training and deployment, yet its effect on prediction \textit{confidence}, not just accuracy, remains unexamined. We show that the softmax function amplifies logit-space quantization errors in an input-dependent manner: confidence distortion scales with the product of the precision-dependent error bound $\epsilon$ and the logit norm, peaking when the model is confident but not saturated. This explains why identical models report different confidence values across precisions, a phenomenon we term \textit{Precision Split}. During training, the same mechanism causes gradient underflow: when logit margins exceed a precision-dependent threshold, gradients vanish and samples silently stop contributing to learning. Since the logit norm serves as a computable proxy for precision-induced risk, we propose Precision-Aware Confidence Scaling (PACS), which applies a sample-adaptive temperature inversely related to this risk, with sub-1\% overhead and no full-precision computation required. On ImageNet with mixed-precision ResNet-50, PACS reduces Expected Calibration Error from 5.82\% to 1.92\% while maintaining accuracy, with consistent improvements across architectures, precision formats, and modalities.
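The abstract does not give the functional form of PACS, so the following is a minimal sketch under stated assumptions: the per-sample risk proxy is taken to be $\epsilon \cdot \lVert z \rVert$, the temperature is assumed to grow with that proxy so that higher-risk samples receive softer probabilities, and the scaling hyperparameter `lam` is hypothetical (not mentioned in the paper).

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax along the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def pacs_probs(logits, eps, lam=1.0):
    """Sketch of Precision-Aware Confidence Scaling (PACS), as read from the abstract.

    logits : (N, C) array of possibly quantized logits
    eps    : precision-dependent error bound for the logit representation
    lam    : hypothetical scaling hyperparameter (assumption, not from the paper)
    """
    # Logit norm as a computable proxy for precision-induced risk.
    risk = np.linalg.norm(logits, axis=-1, keepdims=True)
    # Assumed form: temperature increases with eps * ||z||, so samples whose
    # confidence is most distorted by quantization are scaled toward softer
    # (less confident) probabilities.
    temperature = 1.0 + lam * eps * risk
    return softmax(logits / temperature)

# Example: calibrate fp16-style logits with an assumed error bound.
probs = pacs_probs(np.random.randn(8, 1000).astype(np.float32), eps=2.0**-10)
```

In practice, a hyperparameter such as `lam` would presumably be fit on a held-out calibration set by minimizing Expected Calibration Error, analogous to standard temperature scaling; that fitting step is omitted here.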