Discovering interaction effects on a response of interest is a fundamental problem in biology, medicine, economics, and many other scientific disciplines. In theory, Bayesian methods for discovering pairwise interactions enjoy many benefits, such as coherent uncertainty quantification, the ability to incorporate background knowledge, and desirable shrinkage properties. In practice, however, Bayesian methods are often computationally intractable for even moderate-dimensional problems. Our key insight is that many hierarchical models of practical interest admit a Gaussian process representation such that, rather than maintaining a posterior over all O(p^2) interactions, we need only maintain a vector of O(p) kernel hyperparameters. This implicit representation allows us to run Markov chain Monte Carlo (MCMC) over model hyperparameters in time and memory linear in p per iteration. We focus on sparsity-inducing models and show, on datasets with a variety of covariate behaviors, that our method: (1) reduces runtime by orders of magnitude over naive applications of MCMC, (2) provides lower Type I and Type II error relative to state-of-the-art LASSO-based approaches, and (3) offers improved computational scaling in high dimensions relative to existing Bayesian and LASSO-based approaches.
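The O(p)-per-iteration claim rests on a standard kernel identity: a squared linear kernel evaluates the inner product in the space of all pairwise interaction features without ever materializing those O(p^2) features. A minimal NumPy sketch of that identity (illustrative only; the function names are ours, and this is not the authors' full hierarchical model):

```python
import numpy as np

rng = np.random.default_rng(0)
p = 5
x, z = rng.normal(size=p), rng.normal(size=p)

def interaction_features(v):
    # Explicit feature map: all pairwise products v_i * v_j (O(p^2) entries).
    return np.outer(v, v).ravel()

# Inner product computed explicitly in the O(p^2) feature space...
explicit = interaction_features(x) @ interaction_features(z)

# ...equals the squared linear kernel, computed in O(p) time and memory.
implicit = (x @ z) ** 2

assert np.isclose(explicit, implicit)
```

Because the kernel side of this identity depends on the covariates only through p-dimensional inner products, a posterior over interaction models can be parameterized by O(p) kernel hyperparameters, which is the representation the paper's MCMC scheme exploits.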
Author Information
Raj Agrawal (MIT)
Brian Trippe (MIT)
Jonathan Huggins (Harvard)
Tamara Broderick (MIT)

Tamara Broderick is an Associate Professor in the Department of Electrical Engineering and Computer Science at MIT. She is a member of the MIT Laboratory for Information and Decision Systems (LIDS), the MIT Statistics and Data Science Center, and the Institute for Data, Systems, and Society (IDSS). She completed her Ph.D. in Statistics at the University of California, Berkeley in 2014. Previously, she received an AB in Mathematics from Princeton University (2007), a Master of Advanced Study for completion of Part III of the Mathematical Tripos from the University of Cambridge (2008), an MPhil by research in Physics from the University of Cambridge (2009), and an MS in Computer Science from the University of California, Berkeley (2013). Her recent research has focused on developing and analyzing models for scalable Bayesian machine learning. She has been awarded selection to the COPSS Leadership Academy (2021), an Early Career Grant (ECG) from the Office of Naval Research (2020), an AISTATS Notable Paper Award (2019), an NSF CAREER Award (2018), a Sloan Research Fellowship (2018), an Army Research Office Young Investigator Program (YIP) award (2017), Google Faculty Research Awards, an Amazon Research Award, the ISBA Lifetime Members Junior Researcher Award, the Savage Award (for an outstanding doctoral dissertation in Bayesian theory and methods), the Evelyn Fix Memorial Medal and Citation (for the Ph.D. student on the Berkeley campus showing the greatest promise in statistical research), the Berkeley Fellowship, an NSF Graduate Research Fellowship, a Marshall Scholarship, and the Phi Beta Kappa Prize (for the graduating Princeton senior with the highest academic average).
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Oral: The Kernel Interaction Trick: Fast Bayesian Discovery of Pairwise Interactions in High Dimensions
  Thu. Jun 13th 12:10 -- 12:15 AM, Room 101
More from the Same Authors
- 2021: High-Dimensional Variable Selection and Non-Linear Interaction Discovery in Linear Time
  Raj Agrawal · Tamara Broderick
- 2023: Practical and Asymptotically Exact Conditional Sampling in Diffusion Models
  Brian Trippe · Luhuan Wu · Christian Naesseth · David Blei · John Cunningham
- 2023 Poster: Gaussian processes at the Helm(holtz): A more fluid model for ocean currents
  Renato Berlinghieri · Brian Trippe · David Burt · Ryan Giordano · Kaushik Srinivasan · Tamay Özgökmen · Junfei Xia · Tamara Broderick
- 2023 Poster: SE(3) diffusion model with application to protein backbone generation
  Jason Yim · Brian Trippe · Valentin De Bortoli · Emile Mathieu · Arnaud Doucet · Regina Barzilay · Tommi Jaakkola
- 2021: High-Dimensional Variable Selection and Non-Linear Interaction Discovery in Linear Time
  Tamara Broderick · Raj Agrawal
- 2021 Poster: Finite mixture models do not reliably learn the number of components
  Diana Cai · Trevor Campbell · Tamara Broderick
- 2021 Spotlight: Finite mixture models do not reliably learn the number of components
  Diana Cai · Trevor Campbell · Tamara Broderick
- 2019 Poster: LR-GLM: High-Dimensional Bayesian Inference Using Low-Rank Data Approximations
  Brian Trippe · Jonathan Huggins · Raj Agrawal · Tamara Broderick
- 2019 Oral: LR-GLM: High-Dimensional Bayesian Inference Using Low-Rank Data Approximations
  Brian Trippe · Jonathan Huggins · Raj Agrawal · Tamara Broderick
- 2018 Poster: Bayesian Coreset Construction via Greedy Iterative Geodesic Ascent
  Trevor Campbell · Tamara Broderick
- 2018 Poster: Minimal I-MAP MCMC for Scalable Structure Discovery in Causal DAG Models
  Raj Agrawal · Caroline Uhler · Tamara Broderick
- 2018 Oral: Minimal I-MAP MCMC for Scalable Structure Discovery in Causal DAG Models
  Raj Agrawal · Caroline Uhler · Tamara Broderick
- 2018 Oral: Bayesian Coreset Construction via Greedy Iterative Geodesic Ascent
  Trevor Campbell · Tamara Broderick
- 2018 Tutorial: Variational Bayes and Beyond: Bayesian Inference for Big Data
  Tamara Broderick