Minimal I-MAP MCMC for Scalable Structure Discovery in Causal DAG Models
Raj Agrawal · Caroline Uhler · Tamara Broderick

Fri Jul 13th 06:15 -- 09:00 PM @ Hall B #98

Learning a Bayesian network (BN) from data can be useful for decision-making or discovering causal relationships. However, traditional methods often fail in modern applications, which exhibit a larger number of observed variables than data points. The resulting uncertainty about the underlying network as well as the desire to incorporate prior information recommend a Bayesian approach to learning the BN, but the highly combinatorial structure of BNs poses a striking challenge for inference. The current state-of-the-art methods such as order MCMC are faster than previous methods but prevent the use of many natural structural priors and still have running time exponential in the maximum indegree of the true directed acyclic graph (DAG) of the BN. We here propose an alternative posterior approximation based on the observation that, if we incorporate empirical conditional independence tests, we can focus on a high-probability DAG associated with each order of the vertices. We show that our method allows the desired flexibility in prior specification, removes timing dependence on the maximum indegree, and yields provably good posterior approximations; in addition, we show that it achieves superior accuracy, scalability, and sampler mixing on several datasets.
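
The idea of attaching one DAG to each vertex order via conditional independence tests can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes jointly Gaussian variables, tests conditional independence with partial correlations and Fisher's z-transform, and uses the standard minimal I-MAP rule: for a vertex a preceding b in the order, add the edge a -> b unless a and b test as independent given the remaining predecessors of b. The names `partial_corr` and `minimal_imap`, the covariance-matrix input, and the default `alpha` are hypothetical choices made for this example.

```python
import math
from statistics import NormalDist

import numpy as np


def partial_corr(cov, a, b, cond):
    """Partial correlation of variables a and b given the variables in cond."""
    idx = [a, b] + list(cond)
    # Invert the covariance restricted to {a, b} and the conditioning set.
    prec = np.linalg.inv(cov[np.ix_(idx, idx)])
    return -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])


def minimal_imap(cov, n, order, alpha=0.01):
    """Edges of the minimal I-MAP for one vertex order (Gaussian assumption).

    cov   : covariance matrix (in practice a sample covariance from n rows)
    n     : number of samples behind cov, used by the Fisher z-test
    order : list of vertex indices, earliest first
    """
    crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    edges = set()
    for j in range(1, len(order)):
        for i in range(j):
            a, b = order[i], order[j]
            # Condition on every predecessor of b except a itself.
            cond = [v for v in order[:j] if v != a]
            rho = partial_corr(cov, a, b, cond)
            rho = min(max(rho, -0.999999), 0.999999)  # keep atanh finite
            stat = math.sqrt(n - len(cond) - 3) * abs(math.atanh(rho))
            if stat > crit:  # independence rejected, so keep the edge
                edges.add((a, b))
    return edges


# Population covariance of the chain X0 -> X1 -> X2 with unit-variance noise.
cov = np.array([[1.0, 1.0, 1.0],
                [1.0, 2.0, 2.0],
                [1.0, 2.0, 3.0]])
chain_edges = minimal_imap(cov, n=1000, order=[0, 1, 2])
dense_edges = minimal_imap(cov, n=1000, order=[0, 2, 1])
```

In this toy example the order [0, 1, 2] recovers the two chain edges, while the order [0, 2, 1] produces a denser three-edge graph. That difference is what makes a search over orders informative. The example feeds in the exact population covariance so the outcome is deterministic; with real data one would pass a sample covariance instead.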

Author Information

Raj Agrawal (MIT)
Caroline Uhler (Massachusetts Institute of Technology)

Caroline Uhler joined the MIT faculty in 2015 as the Henry L. and Grace Doherty assistant professor in the Department of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society. She holds an MSc in mathematics, a BSc in biology, and an MEd in high school mathematics education from the University of Zurich. She obtained her PhD in statistics, with a designated emphasis in computational and genomic biology, from the University of California, Berkeley. Before joining MIT, she spent a semester as a research fellow in the program on Theoretical Foundations of Big Data Analysis at the Simons Institute at UC Berkeley, held postdoctoral positions at the Institute for Mathematics and its Applications at the University of Minnesota and at ETH Zurich, and spent three years as an assistant professor at IST Austria. She is an elected member of the International Statistical Institute and a Sloan Research Fellow, and she has received an NSF CAREER Award, a Sofja Kovalevskaja Award from the Humboldt Foundation, and a START Award from the Austrian Science Foundation. Her research focuses on mathematical statistics and computational biology, in particular on graphical models and causal inference.

Tamara Broderick (MIT)

Tamara Broderick is the ITT Career Development Assistant Professor in the Department of Electrical Engineering and Computer Science at MIT. She is a member of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), the MIT Statistics and Data Science Center, and the Institute for Data, Systems, and Society (IDSS). She completed her Ph.D. in Statistics at the University of California, Berkeley in 2014. Previously, she received an AB in Mathematics from Princeton University (2007), a Master of Advanced Study for completion of Part III of the Mathematical Tripos from the University of Cambridge (2008), an MPhil by research in Physics from the University of Cambridge (2009), and an MS in Computer Science from the University of California, Berkeley (2013). Her recent research has focused on developing and analyzing models for scalable Bayesian machine learning, especially Bayesian nonparametrics. She has been awarded an NSF CAREER Award (2018), a Sloan Research Fellowship (2018), an Army Research Office Young Investigator Program award (2017), a Google Faculty Research Award, the ISBA Lifetime Members Junior Researcher Award, the Savage Award (for an outstanding doctoral dissertation in Bayesian theory and methods), the Evelyn Fix Memorial Medal and Citation (for the Ph.D. student on the Berkeley campus showing the greatest promise in statistical research), the Berkeley Fellowship, an NSF Graduate Research Fellowship, a Marshall Scholarship, and the Phi Beta Kappa Prize (for the graduating Princeton senior with the highest academic average).
