Causal-EPIG: Causally Aligned Active CATE Estimation
Abstract
Estimating the Conditional Average Treatment Effect (CATE) is constrained by the high cost of obtaining outcome measurements, making active learning essential. However, conventional strategies suffer from a fundamental objective mismatch: they reduce uncertainty in model parameters or observable outcomes rather than in the unobservable causal quantities of interest. We address this via the principle of causal objective alignment, which posits that acquisition functions must directly target unobservables such as potential outcomes or the CATE. We operationalize this principle through Causal-EPIG, a framework that adapts Expected Predictive Information Gain to quantify uncertainty reduction in causal quantities. We derive two distinct strategies: a comprehensive approach that models the full causal mechanism via the joint potential outcomes, and a focused approach that directly targets the CATE estimand for maximum sample efficiency. We provide theoretical justification for our framework by establishing a formal link between our information-theoretic objective and the minimization of CATE estimation error. Extensive experiments demonstrate that our strategies consistently outperform standard baselines and, crucially, reveal that the optimal strategy is context-dependent: it depends on the base estimator and on data complexity. Our framework thus provides a principled guide for sample-efficient CATE estimation in practice.
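To make the idea of a CATE-targeted acquisition score concrete, the following is a minimal sketch and not the estimator used in this work. It assumes posterior samples are available from some Bayesian model and substitutes a joint-Gaussian approximation, under which the mutual information between two scalars is -0.5 log(1 - rho^2), for whatever estimator the framework actually employs. The function names (`gaussian_mi`, `cate_targeted_score`), the toy linear posterior, and all numerical settings are hypothetical and chosen only for illustration.

```python
import numpy as np

def gaussian_mi(a, b):
    """Mutual information (nats) between two scalars under a joint-Gaussian
    approximation, estimated from paired samples as -0.5 * log(1 - rho^2)."""
    rho = np.corrcoef(a, b)[0, 1]
    rho2 = min(rho ** 2, 1.0 - 1e-12)  # guard against log(0) when |rho| is ~1
    return -0.5 * np.log1p(-rho2)

def cate_targeted_score(y_cand, tau_targets):
    """Average MI between a candidate's outcome and the CATE at each target point.

    y_cand:      (S,) posterior-predictive draws of the outcome at candidate (x, t)
    tau_targets: (S, M) matching posterior draws of the CATE at M target covariates
    """
    return float(np.mean([gaussian_mi(y_cand, tau_targets[:, j])
                          for j in range(tau_targets.shape[1])]))

# Toy posterior (an assumption for illustration): mu_t(x) = a_t * x,
# where the control slope a_0 is well known and the treated slope a_1 is uncertain.
rng = np.random.default_rng(0)
S = 5000
a0 = rng.normal(1.0, 0.1, size=S)
a1 = rng.normal(2.0, 1.0, size=S)

# CATE draws at M = 4 target covariates: tau(x*) = (a_1 - a_0) * x*.
x_targets = np.linspace(0.5, 2.0, 4)
tau_targets = (a1 - a0)[:, None] * x_targets[None, :]  # shape (S, M)

# Posterior-predictive outcome draws for one candidate covariate under each arm.
x_cand = 1.0
noise = rng.normal(0.0, 0.5, size=(2, S))
y_control = a0 * x_cand + noise[0]
y_treated = a1 * x_cand + noise[1]

score_control = cate_targeted_score(y_control, tau_targets)
score_treated = cate_targeted_score(y_treated, tau_targets)
print(f"control: {score_control:.3f}, treated: {score_treated:.3f}")
```

In this toy setting almost all uncertainty about the CATE comes from the treated slope, so the treated-arm query receives the higher score. A criterion that only measured uncertainty in the observable outcome would not, by itself, distinguish outcome uncertainty that is relevant to the CATE from uncertainty that is not.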