Reliable probability estimation is of crucial importance in many real-world applications where there is inherent (aleatoric) uncertainty. Probability-estimation models are trained on observed outcomes (e.g. whether it has rained or not, or whether a patient has died or not), because the ground-truth probabilities of the events of interest are typically unknown. The problem is therefore analogous to binary classification, with the difference that the objective is to estimate probabilities rather than to predict the specific outcome. This work investigates probability estimation from high-dimensional data using deep neural networks. There exist several methods to improve the probabilities generated by these models, but they mostly focus on model (epistemic) uncertainty. For problems with inherent uncertainty, it is challenging to evaluate performance without access to ground-truth probabilities. To address this, we build a synthetic dataset to study and compare different computable metrics. We evaluate existing methods on the synthetic data as well as on three real-world probability estimation tasks, all of which involve inherent uncertainty: precipitation forecasting from radar images, predicting cancer patient survival from histopathology images, and predicting car crashes from dashcam videos. We also give a theoretical analysis of a model for high-dimensional probability estimation which reproduces several of the phenomena evinced in our experiments. Finally, we propose a new method for probability estimation using neural networks, which modifies the training process to promote output probabilities that are consistent with empirical probabilities computed from the data. The method outperforms existing approaches on most metrics on both the simulated and the real-world data.
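To make the evaluation problem concrete, the following is a minimal sketch, not the paper's dataset or method. It draws hypothetical ground-truth probabilities, samples binary outcomes from them, and compares two metrics that can be computed from outcomes alone (Brier score and a binned expected calibration error) with the mean squared error against the ground truth, which is only available on synthetic data. All function names, the uniform distribution of probabilities, and the `overconfident` distortion are illustrative assumptions.

```python
import random


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((q - y) ** 2 for q, y in zip(probs, outcomes)) / len(probs)


def mse_to_truth(probs, true_probs):
    """Mean squared error against ground-truth probabilities (synthetic data only)."""
    return sum((q - p) ** 2 for q, p in zip(probs, true_probs)) / len(probs)


def expected_calibration_error(probs, outcomes, n_bins=10):
    """Weighted average gap between mean prediction and empirical frequency,
    computed over equal-width probability bins."""
    bins = [[] for _ in range(n_bins)]
    for q, y in zip(probs, outcomes):
        idx = min(int(q * n_bins), n_bins - 1)  # q == 1.0 goes in the last bin
        bins[idx].append((q, y))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_q = sum(q for q, _ in members) / len(members)
        freq = sum(y for _, y in members) / len(members)
        ece += len(members) / len(probs) * abs(mean_q - freq)
    return ece


def make_synthetic(n, seed=0):
    """Assumed toy setup: p ~ Uniform(0, 1), outcome ~ Bernoulli(p)."""
    rng = random.Random(seed)
    true_probs = [rng.random() for _ in range(n)]
    outcomes = [1 if rng.random() < p else 0 for p in true_probs]
    return true_probs, outcomes


def overconfident(p, sharpness=3.0):
    """Hypothetical distortion that pushes estimates toward 0 or 1."""
    a = p ** sharpness
    b = (1.0 - p) ** sharpness
    return a / (a + b)


true_probs, outcomes = make_synthetic(20000)
estimators = {
    "ideal": true_probs,  # outputs the ground-truth probabilities
    "overconfident": [overconfident(p) for p in true_probs],
}
for name, est in estimators.items():
    print(
        name,
        "brier=%.4f" % brier_score(est, outcomes),
        "ece=%.4f" % expected_calibration_error(est, outcomes),
        "mse_to_truth=%.4f" % mse_to_truth(est, true_probs),
    )
```

In this toy setting the outcome-based metrics should rank the two estimators the same way as the ground-truth error does. That agreement is the kind of property a synthetic dataset allows one to check.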
Author Information
Sheng Liu (NYU)
Aakash Kaku (New York University)
Weicheng Zhu (New York University)
Matan Leibovich (New York University)
Sreyas Mohan (NYU)
Boyang Yu (NYU Center for Data Science)
Haoxiang Huang (New York University)
Laure Zanna (NYU)
Narges Razavian (New York University)
Jonathan Niles-Weed (NYU)
Carlos Fernandez-Granda
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Spotlight: Deep Probability Estimation
  Tue. Jul 19th 08:25 -- 08:30 PM Room Ballroom 1 & 2
More from the Same Authors
- 2023 Poster: Evaluating Unsupervised Denoising Requires Unsupervised Metrics
  Adrià Marcos Morales · Matan Leibovich · Sreyas Mohan · Joshua Vincent · Piyush Haluai · Mai Tan · Peter Crozier · Carlos Fernandez-Granda
- 2023 Poster: Minimax estimation of discontinuous optimal transport maps: The semi-discrete case
  Aram-Alexandre Pooladian · Vincent Divol · Jonathan Niles-Weed
- 2023 Poster: Perturbation Analysis of Neural Collapse
  Tom Tirer · Haoxiang Huang · Jonathan Niles-Weed
- 2022 Poster: Debiaser Beware: Pitfalls of Centering Regularized Transport Maps
  Aram-Alexandre Pooladian · Marco Cuturi · Jonathan Niles-Weed
- 2022 Spotlight: Debiaser Beware: Pitfalls of Centering Regularized Transport Maps
  Aram-Alexandre Pooladian · Marco Cuturi · Jonathan Niles-Weed
- 2020 Poster: Supervised Quantile Normalization for Low Rank Matrix Factorization
  Marco Cuturi · Olivier Teboul · Jonathan Niles-Weed · Jean-Philippe Vert