Evaluating bivariate causal statements based on mutual compatibility
Erik Jahn ⋅ Dominik Janzing
Abstract
For many real-world systems, causal ground truth is difficult to obtain, making claims about causal effects hard to assess. We develop methods for evaluating collections of $\binom{n}{2}$ bivariate causal statements over a set of $n$ variables. In the setting of acyclic linear statements, any such collection can be extended to a unique multivariate causal model, but we argue that this induced model is implausible if it imposes substantial additional confounding to explain observed correlations. We introduce a compatibility score that quantifies this notion of plausibility, notably without relying on the faithfulness assumption. Additionally, we define an incompatibility score for purely graphical bivariate causal statements, based on global consistency constraints that are derived from acyclicity and faithfulness assumptions. We give theoretical and empirical evidence that both scores can successfully distinguish correct from incorrect causal statements in generic settings. Moreover, we demonstrate the practical applicability of our methods by analyzing causal claims made by large language models. Our work aims to provide a foundation for assessing the reliability of causal information derived from human experts or artificial intelligence in settings where alternative forms of validation are unavailable.