Causal Structure Discovery from Distributions Arising from Mixtures of DAGs
Basil Saeed · Snigdha Panigrahi · Caroline Uhler

We consider distributions arising from a mixture of causal models, where each model is represented by a directed acyclic graph (DAG). We provide a graphical representation of such mixture distributions and prove that this representation encodes the conditional independence relations of the mixture distribution. We then consider the problem of structure learning based on samples from such distributions. Since the mixing variable is latent, we consider causal structure discovery algorithms such as FCI that can deal with latent variables. We show that such algorithms recover a “union” of the component DAGs and can identify variables whose conditional distributions vary across the component DAGs. We demonstrate our results on synthetic and real data, showing that the inferred graph identifies nodes that vary between the different mixture components. As an immediate application, we demonstrate how retrieval of this causal information can be used to cluster samples according to each mixture component.
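
The role of the latent mixing variable can be illustrated with a small simulation. The sketch below is not the authors' code or experimental setup; it assumes a hypothetical two-component mixture over two variables X and Y that are independent within each component, with only their means differing between components. All names (`sample_mixture`, `pearson`) and parameter values are illustrative assumptions. Pooling the samples makes X and Y look dependent, which is the kind of signal a latent-variable method would attribute to an unobserved common cause.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def sample_mixture(n, seed=0):
    """Draw n samples (j, x, y) from a hypothetical two-component mixture.

    j is the latent component label. Within each component, x and y are
    drawn independently; only their means depend on j (an assumption made
    purely for illustration).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        j = rng.randrange(2)       # latent mixing variable, uniform on {0, 1}
        shift = 3.0 * j            # component-specific mean for both variables
        x = rng.gauss(shift, 1.0)  # x does not depend on y given j
        y = rng.gauss(shift, 1.0)  # y does not depend on x given j
        rows.append((j, x, y))
    return rows


rows = sample_mixture(20000)

# Correlation when the component label is hidden (pooled samples).
pooled = pearson([r[1] for r in rows], [r[2] for r in rows])

# Correlation within each component, i.e. conditioning on the latent label.
within = [
    pearson(
        [r[1] for r in rows if r[0] == j],
        [r[2] for r in rows if r[0] == j],
    )
    for j in (0, 1)
]

print("pooled correlation:", round(pooled, 3))
print("within-component correlations:", [round(w, 3) for w in within])
```

In this toy setting the pooled correlation is clearly positive while the within-component correlations are close to zero, so the dependence arises only because the component label is unobserved.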

Basil Saeed (MIT)
Snigdha Panigrahi (University of Michigan)
Caroline Uhler (Massachusetts Institute of Technology)
Caroline Uhler joined the MIT faculty in 2015 as the Henry L. and Grace Doherty assistant professor in the Department of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society. She holds an MSc in mathematics, a BSc in biology, and an MEd in high school mathematics education from the University of Zurich. She obtained her PhD in statistics, with a designated emphasis in computational and genomic biology, from the University of California, Berkeley. Before joining MIT, she spent a semester as a research fellow in the program on Theoretical Foundations of Big Data Analysis at the Simons Institute at UC Berkeley, held postdoctoral positions at the Institute for Mathematics and its Applications at the University of Minnesota and at ETH Zurich, and spent three years as an assistant professor at IST Austria. She is an elected member of the International Statistical Institute and a Sloan Research Fellow, and she has received an NSF CAREER Award, a Sofja Kovalevskaja Award from the Humboldt Foundation, and a START Award from the Austrian Science Foundation. Her research focuses on mathematical statistics and computational biology, in particular on graphical models and causal inference.

