Prediction-Powered Adaptive Inference with Pretrained AI Models for Contextual Bandits
Abstract
In adaptive experiments, statistical inference is essential for reliable decision-making and scientific discovery. In these settings, labeled data are often expensive to collect, yet decision-makers have access to large unlabeled datasets and to strong pretrained AI models that can generate outcome predictions. Effectively leveraging these predictions in online experiments poses fundamental challenges for statistical inference: AI models may be misspecified, and data collected under adaptive policies are inherently non-i.i.d., invalidating classical inference techniques. To address these challenges, we propose a Prediction-Powered Adaptive Inference (PPAI) estimator that integrates unlabeled data, predicted labels, and adaptively collected labeled data through a single estimating equation. We establish asymptotic normality of the PPAI estimator under mild conditions on the data-collection policy, enabling valid confidence intervals and hypothesis tests for a broad class of Z-functionals. The method incorporates a data-driven tuning mechanism that adaptively weights AI predictions according to their informativeness, guaranteeing that the resulting asymptotic variance is never worse than that of the labeled-only baseline and is strictly smaller when the predictions are informative. Numerical experiments support the theory, illustrating efficiency gains with informative AI predictions and robust performance when predictions are inaccurate.
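To give intuition for the combination of unlabeled predictions, a data-driven weight, and a labeled debiasing term described above, the following is a minimal sketch of the generic power-tuned prediction-powered idea for a simple mean, in a plain i.i.d. setting. It is not the paper's PPAI estimator (which handles adaptively collected data and general Z-functionals), and all variable names and simulation numbers are hypothetical.

```python
import numpy as np

# Illustrative sketch only, not the paper's PPAI estimator: power-tuned
# prediction-powered estimation of a mean from i.i.d. data. All names and
# numbers are hypothetical.
rng = np.random.default_rng(0)
n, N, theta = 500, 50_000, 1.0          # labeled size, unlabeled size, true mean

# Labeled data: outcomes Y and (biased, noisy) AI predictions f(X).
Y = theta + rng.normal(0.0, 1.0, n)
f_lab = 0.8 * Y + 0.2 + rng.normal(0.0, 0.3, n)

# Unlabeled data: predictions only; the outcomes are never observed.
Y_hidden = theta + rng.normal(0.0, 1.0, N)
f_unlab = 0.8 * Y_hidden + 0.2 + rng.normal(0.0, 0.3, N)

# Data-driven weight lam ~ Cov(Y, f) / Var(f): shrinks toward 0 when the
# predictions carry no signal, recovering the labeled-only estimator.
lam = np.cov(Y, f_lab)[0, 1] / np.var(f_lab)

# Point estimate: weighted prediction mean on the large unlabeled set,
# debiased by the labeled residual mean of Y - lam * f(X).
theta_pp = lam * f_unlab.mean() + (Y - lam * f_lab).mean()
theta_lab = Y.mean()                    # labeled-only baseline

# Plug-in variance estimates: with the tuned weight, the prediction-powered
# estimator should be no worse than the labeled-only baseline.
v_pp = np.var(Y - lam * f_lab) / n + lam**2 * np.var(f_unlab) / N
v_lab = np.var(Y) / n
```

With informative predictions as simulated here, `v_pp` is far below `v_lab`; with pure-noise predictions, `lam` shrinks toward zero and the two estimators essentially coincide, mirroring the no-worse-than-baseline guarantee stated in the abstract.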