

Poster in Workshop: Structured Probabilistic Inference and Generative Modeling

Beyond Confidence: Reliable Models Should Also Consider Atypicality

Mert Yuksekgonul · Linjun Zhang · James Zou · Carlos Guestrin

Keywords: [ calibration ] [ reliable machine learning ] [ uncertainty ] [ trustworthy machine learning ]


Abstract:

While most machine learning models can provide confidence in their predictions, confidence alone is insufficient to understand a prediction's reliability. For instance, a model may produce a low-confidence prediction if the input is not well represented in the training dataset or if the input is inherently ambiguous. In this work, we investigate the relationship between how atypical (rare) a sample or a class is and the reliability of a model's predictions. We first demonstrate that atypicality is strongly related to miscalibration and accuracy. In particular, we empirically show that predictions for atypical inputs or atypical classes are more overconfident and have lower accuracy. Using these insights, we show that incorporating atypicality improves uncertainty quantification and model performance for discriminative neural networks and large language models. In a case study, we show that using atypicality improves the performance of a skin lesion classifier across different skin tone groups without access to the group attributes. Overall, we propose that models should use not only confidence but also atypicality to improve uncertainty quantification and performance. Our results show that simple atypicality estimators already provide large benefits.
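The abstract does not spell out an estimator, so the sketch below is purely illustrative of the kind of "simple atypicality estimator" it alludes to, not the authors' exact method. It assumes access to feature embeddings and held-out logits; the function names, the Mahalanobis-distance notion of atypicality, and the quantile-grouped temperature scaling are all assumptions made for this example.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import log_softmax

# Illustrative sketch only: one simple way to estimate atypicality and
# fold it into recalibration, under the assumptions stated above.

def fit_class_gaussians(embeddings, labels):
    """Fit one Gaussian per class with a shared, pooled covariance
    (a common, simple density model for embedding spaces)."""
    classes = np.unique(labels)
    means = {c: embeddings[labels == c].mean(axis=0) for c in classes}
    centered = np.concatenate(
        [embeddings[labels == c] - means[c] for c in classes]
    )
    # Small ridge term keeps the covariance invertible.
    cov = np.cov(centered, rowvar=False) + 1e-6 * np.eye(embeddings.shape[1])
    return means, np.linalg.inv(cov)

def atypicality(x, means, precision):
    """Atypicality as the minimum Mahalanobis distance to any class mean:
    large values indicate an input far from the training distribution."""
    return min(
        float((x - mu) @ precision @ (x - mu)) for mu in means.values()
    )

def fit_temperature(logits, labels):
    """Fit a scalar temperature by minimizing negative log-likelihood
    on held-out data (standard temperature scaling)."""
    def nll(t):
        logp = log_softmax(logits / t, axis=1)
        return -logp[np.arange(len(labels)), labels].mean()
    return minimize_scalar(nll, bounds=(0.05, 20.0), method="bounded").x

def atypicality_aware_temperatures(logits, labels, atyp, n_groups=5):
    """Fit one temperature per atypicality quantile group, so atypical
    (typically overconfident) inputs receive stronger smoothing."""
    edges = np.quantile(atyp, np.linspace(0.0, 1.0, n_groups + 1))
    groups = np.clip(
        np.searchsorted(edges, atyp, side="right") - 1, 0, n_groups - 1
    )
    temps = [
        fit_temperature(logits[groups == g], labels[groups == g])
        for g in range(n_groups)
    ]
    return edges, temps
```

At test time, one would compute an input's atypicality, look up its quantile group via the stored edges, and divide its logits by that group's temperature before the softmax; the grouping is what lets calibration differ between typical and atypical inputs.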
