Poster in Workshop: Accessible and Efficient Foundation Models for Biological Discovery

Learning Generative Population Models From Multiple Clinical Datasets Via Probabilistic Programming

João Loula · Katie Collins · Ulrich Schaechtle · Josh Tenenbaum · Adrian Weller · Feras Saad · Timothy O'Donnell · Vikash Mansinghka

Keywords: [ oncology ] [ Probabilistic Programming ] [ Bayesian Inference ] [ structure learning ]


Abstract:

Accurate, efficient generative models of clinical populations could accelerate clinical research and improve patient outcomes. For example, such models could infer probable treatment outcomes for different subpopulations, generate high-fidelity synthetic data that can be shared across organizational boundaries, and discover new relationships among clinical variables. Using Bayesian structure learning, we show that it is possible to learn probabilistic program models of clinical populations by combining data from multiple, sparsely overlapping clinical datasets. Through experiments with multiple clinical trials and real-world evidence from census health surveys, we show that our model generates higher-quality synthetic data than neural network baselines, supports more accurate inferences across datasets than traditional statistical methods, and can be queried more efficiently than both, opening up new avenues for accessible and efficient AI assistance in clinical research.
