Unveiling Prior-data Fitted Networks for Causal Effect Estimation: Pre-training or Fine-tuning?
Abstract
Amortized causal inference via Prior-data Fitted Networks (PFNs) has emerged as a promising paradigm, enabling zero-shot estimation of causal effects without dataset-specific model tuning. However, whether a single pre-trained model remains effective across general interventional regimes is an underexplored question. In this paper, we study interventions on subsets of variables within Structural Causal Models (SCMs) and identify a fundamental theoretical limitation of current pre-training approaches: we prove that a single observational SCM induces an exponentially large space of interventional distributions, a phenomenon we term prior uncoverage. This uncoverage creates a mismatch between the learned meta-prior and the true grounding prior, leading to unavoidable posterior inconsistency and estimation bias. We therefore posit that fine-tuning is a fundamental necessity and propose a target-specific strategy, Point-Wise Interventional Fine-tuning (PWF), which attains a local generalization property. We further scale this approach with Meta-Sampling Fine-tuning (MSF), designed from a budgeted active-learning perspective, thereby achieving uniform generalization over arbitrary interventional distributions.
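As a quick intuition for the exponential blow-up, consider the simplifying assumption of hard interventions $do(X_S = x_S)$ on subsets $S$ of the $d$ endogenous variables (a minimal counting sketch, not the formal statement; soft or stochastic interventions only enlarge the space). The intervention targets alone already number

\[
  \bigl|\{\, S : S \subseteq \{1,\dots,d\} \,\}\bigr|
  \;=\; \sum_{k=0}^{d} \binom{d}{k}
  \;=\; 2^{d},
\]

and each target $S$ further indexes a continuum of distributions $P(\cdot \mid do(X_S = x_S))$ over the possible values $x_S$, which is the intuition behind the prior uncoverage phenomenon above.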