Conformal prediction is a theoretically grounded framework for constructing predictive intervals. We study conformal prediction with missing values in the covariates, a setting that brings new challenges to uncertainty quantification. We first show that the marginal coverage guarantee of conformal prediction holds on imputed data for any missingness distribution and almost all imputation functions. However, we emphasize that the average coverage varies depending on the pattern of missing values: conformal methods tend to construct prediction intervals that under-cover the response conditionally on some missingness patterns. This motivates our novel generalized conformalized quantile regression framework, missing data augmentation, which yields prediction intervals that are valid conditionally on the patterns of missing values, despite their exponential number. We then show that a universally consistent quantile regression algorithm trained on the imputed data is Bayes optimal for the pinball risk, thus achieving valid coverage conditionally on any given data point. Moreover, we examine the case of a linear model, which demonstrates the importance of our proposal in overcoming the heteroskedasticity induced by missing values. Using synthetic data and data from critical care, we corroborate our theory and report improved performance of our methods.
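The first two claims of the abstract (marginal validity on imputed data, and coverage that varies with the missingness pattern) can be illustrated with a small simulation. The sketch below is not the authors' method: it uses plain split conformal prediction with absolute-residual scores, mean imputation, and a deliberately simple linear predictor. The data-generating model, the coefficients `BETA`, the missingness rate `P_MISSING`, and all function names are assumptions made for illustration only.

```python
import math
import random

random.seed(0)
ALPHA = 0.1              # target miscoverage level (assumed)
BETA = [2.0, 2.0, 2.0]   # hypothetical linear model coefficients
P_MISSING = 0.2          # each covariate is masked independently (MCAR)


def draw(n):
    """Draw n pairs (x, y); missing covariates are stored as None."""
    data = []
    for _ in range(n):
        x = [random.gauss(0.0, 1.0) for _ in BETA]
        y = sum(b * v for b, v in zip(BETA, x)) + random.gauss(0.0, 1.0)
        x_obs = [None if random.random() < P_MISSING else v for v in x]
        data.append((x_obs, y))
    return data


train, calib, test = draw(2000), draw(2000), draw(5000)
d = len(BETA)

# Mean imputation, with the means learned on the training split only.
means = []
for j in range(d):
    seen = [x[j] for x, _ in train if x[j] is not None]
    means.append(sum(seen) / len(seen))


def impute(x):
    return [means[j] if v is None else v for j, v in enumerate(x)]


# Simple predictor: intercept plus one univariate slope per imputed covariate.
y_bar = sum(y for _, y in train) / len(train)
slopes = []
for j in range(d):
    xs = [impute(x)[j] for x, _ in train]
    num = sum((xj - means[j]) * (y - y_bar) for xj, (_, y) in zip(xs, train))
    den = sum((xj - means[j]) ** 2 for xj in xs)
    slopes.append(num / den)


def predict(x):
    xi = impute(x)
    return y_bar + sum(s * (v - m) for s, v, m in zip(slopes, xi, means))


# Split conformal calibration: the k-th smallest absolute residual,
# with k = ceil((n_cal + 1) * (1 - alpha)).
scores = sorted(abs(y - predict(x)) for x, y in calib)
k = math.ceil((len(scores) + 1) * (1 - ALPHA))
q_hat = scores[min(k, len(scores)) - 1]


def covered(x, y):
    """True if y falls inside the interval predict(x) +/- q_hat."""
    return abs(y - predict(x)) <= q_hat


marginal = sum(covered(x, y) for x, y in test) / len(test)

# Coverage grouped by the number of missing covariates in the test point.
by_count = {}
for x, y in test:
    m = sum(v is None for v in x)
    hit, tot = by_count.get(m, (0, 0))
    by_count[m] = (hit + covered(x, y), tot + 1)
coverage_by_count = {m: h / t for m, (h, t) in sorted(by_count.items())}

print("marginal coverage:", round(marginal, 3))
print("coverage by number of missing covariates:", coverage_by_count)
```

In this toy setting the interval has a single width for every test point, so fully observed points tend to be over-covered while points with missing covariates tend to be under-covered, even though the overall coverage stays close to the target level. This is the kind of pattern-dependent behaviour the abstract's proposal is designed to correct.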
Author Information
Margaux Zaffran (INRIA)
Aymeric Dieuleveut (École polytechnique)
Julie Josse (Polytechnique)
Yaniv Romano (Technion - Israel Institute of Technology)
More from the Same Authors
- 2023: Continuous Vector Quantile Regression
  Sanketh Vedula · Irene Tallini · Aviv A. Rosenberg · Marco Pegoraro · Emanuele Rodola · Yaniv Romano · Alexander Bronstein
- 2023 Poster: Naive imputation implicitly regularizes high-dimensional linear models
  Alexis Ayme · Claire Boyer · Aymeric Dieuleveut · Erwan Scornet
- 2022: Spotlight Presentations
  Adrian Weller · Osbert Bastani · Jake Snell · Tal Schuster · Stephen Bates · Zhendong Wang · Margaux Zaffran · Danielle Rasooly · Varun Babbar
- 2022 Poster: An Asymptotic Test for Conditional Independence using Analytic Kernel Embeddings
  Meyer Scetbon · Laurent Meunier · Yaniv Romano
- 2022 Poster: Adaptive Conformal Predictions for Time Series
  Margaux Zaffran · Olivier FERON · Yannig Goude · Julie Josse · Aymeric Dieuleveut
- 2022 Spotlight: Adaptive Conformal Predictions for Time Series
  Margaux Zaffran · Olivier FERON · Yannig Goude · Julie Josse · Aymeric Dieuleveut
- 2022 Spotlight: An Asymptotic Test for Conditional Independence using Analytic Kernel Embeddings
  Meyer Scetbon · Laurent Meunier · Yaniv Romano
- 2022 Poster: Near-optimal rate of consistency for linear models with missing values
  Alexis Ayme · Claire Boyer · Aymeric Dieuleveut · Erwan Scornet
- 2022 Spotlight: Near-optimal rate of consistency for linear models with missing values
  Alexis Ayme · Claire Boyer · Aymeric Dieuleveut · Erwan Scornet
- 2022 Poster: Image-to-Image Regression with Distribution-Free Uncertainty Quantification and Applications in Imaging
  Anastasios Angelopoulos · Amit Pal Kohli · Stephen Bates · Michael Jordan · Jitendra Malik · Thayer Alshaabi · Srigokul Upadhyayula · Yaniv Romano
- 2022 Poster: Coordinated Double Machine Learning
  Nitai Fingerhut · Matteo Sesia · Yaniv Romano
- 2022 Spotlight: Coordinated Double Machine Learning
  Nitai Fingerhut · Matteo Sesia · Yaniv Romano
- 2022 Spotlight: Image-to-Image Regression with Distribution-Free Uncertainty Quantification and Applications in Imaging
  Anastasios Angelopoulos · Amit Pal Kohli · Stephen Bates · Michael Jordan · Jitendra Malik · Thayer Alshaabi · Srigokul Upadhyayula · Yaniv Romano
- 2020 Workshop: Learning with Missing Values
  Julie Josse · Jes Frellsen · Pierre-Alexandre Mattei · Gael Varoquaux
- 2020: Opening Session
  Julie Josse · Jes Frellsen · Pierre-Alexandre Mattei · Gael Varoquaux
- 2020 Poster: Missing Data Imputation using Optimal Transport
  Boris Muzellec · Julie Josse · Claire Boyer · Marco Cuturi
- 2020 Poster: On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent
  Scott Pesme · Aymeric Dieuleveut · Nicolas Flammarion