Off-Policy Evaluation Beyond Overlap under Network Interference
Abstract
Off-policy evaluation (OPE) aims to estimate the value of a target policy from historical logged data, without interacting with the environment, in order to assess policy performance. Under network interference, the stable unit treatment value assumption (SUTVA) no longer holds: an individual's outcome depends not only on their own treatment but also on the treatments of their neighbors, which complicates both the definition and the estimation of policy value. To capture this interference mechanism, we allow all neighbors to affect individual outcomes through a unified exposure mapping, and we use a decaying higher-order neighborhood aggregation to characterize the influence of more distant neighbors. Moreover, in real-world applications the target policy and the logging policy often do not fully overlap (non-overlap), so the policy value in non-overlap regions cannot be point-identified. To address this issue, we partially identify the policy value over the non-overlap regions and, under a smoothness assumption, formulate the estimation of its lower and upper bounds as a linear program, yielding valid bounds on the offline policy value. Finally, we conduct systematic experiments on semi-synthetic network data to validate the effectiveness and robustness of the proposed method under network interference and limited overlap.
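The linear-programming step above can be illustrated with a small sketch. This is not the paper's estimator; it is a hypothetical toy in which unknown mean outcomes on non-overlap units become LP variables, a Lipschitz-type smoothness constant `L` links them to observed overlap units and to each other, and minimizing/maximizing the average of the unknowns yields a partial identification interval for the non-overlap contribution to policy value. All positions, outcomes, and the constant `L` are made up for illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Toy partial identification under a smoothness (Lipschitz) assumption.
# All data below are synthetic and purely illustrative.
rng = np.random.default_rng(0)

# Overlap units: covariate positions and observed outcomes under the target exposure.
pos_obs = rng.uniform(0, 1, size=(8, 2))
y_obs = np.sin(3 * pos_obs[:, 0]) + pos_obs[:, 1]

# Non-overlap units: their mean outcomes are the unknown LP variables x.
pos_unk = rng.uniform(0, 1, size=(5, 2))
L = 5.0  # assumed Lipschitz constant of the mean-outcome function
n = len(pos_unk)

# Smoothness w.r.t. each observed unit gives box bounds on each x_i:
# |x_i - y_j| <= L * d(i, j)  =>  max_j(y_j - L d) <= x_i <= min_j(y_j + L d)
d_obs = np.linalg.norm(pos_unk[:, None] - pos_obs[None], axis=2)
lo = (y_obs[None] - L * d_obs).max(axis=1)
hi = (y_obs[None] + L * d_obs).min(axis=1)

# Pairwise smoothness among unknown units: |x_i - x_k| <= L * d(i, k),
# encoded as two linear inequalities A_ub @ x <= b_ub per pair.
rows, b = [], []
for i in range(n):
    for k in range(i + 1, n):
        d = np.linalg.norm(pos_unk[i] - pos_unk[k])
        r = np.zeros(n)
        r[i], r[k] = 1.0, -1.0
        rows += [r, -r]
        b += [L * d, L * d]
A_ub, b_ub = np.array(rows), np.array(b)

# Non-overlap contribution to policy value: a uniform average of the unknowns.
c = np.full(n, 1.0 / n)

lower = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=list(zip(lo, hi)))
upper = linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=list(zip(lo, hi)))
print(f"partial identification interval: [{lower.fun:.3f}, {-upper.fun:.3f}]")
```

Because the true outcome function in the toy has a Lipschitz constant below the assumed `L`, the LP is feasible and the resulting interval necessarily contains the true non-overlap average; tightening `L` shrinks the interval, which mirrors the role of the smoothness assumption in the paper.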