Causal Identification from Counterfactual Data: Completeness and Bounding Results
Abstract
Previous work establishing completeness results for counterfactual identification has been limited to the setting where the input data come from observational and interventional distributions (Layers 1 and 2 of Pearl's Causal Hierarchy), since it was generally presumed impossible to obtain data from counterfactual distributions, which belong to Layer 3. However, recent work (Raghavan & Bareinboim, 2025) has formally characterized a family of counterfactual distributions that can be directly estimated via experimental methods, a notion they call counterfactual realizability. This leaves open the question of which additional Layer 3 quantities become identifiable given this new access to (some) Layer 3 data. We develop the ctfIDu+ algorithm for identifying a counterfactual query from an arbitrary set of Layer 3 data, and prove that it is complete for this task. Using this, we establish the theoretical limit of which counterfactuals can be identified from physically realizable data, thus implying the fundamental limit to exact causal inference in the non-parametric setting. Finally, we derive novel analytic bounds for important non-identifiable quantities given realizable counterfactual data and show that they are provably tighter than the previously established benchmark. Simulations corroborate that, even when a quantity is non-identifiable, counterfactual data can be used to further tighten the bounds on its range.