Synthetic Healthcare Data Generation and Assessment: Challenges, Methods, and Impact on Machine Learning
Ahmed M. Alaa · Mihaela van der Schaar

Mon Jul 19 08:00 AM -- 11:00 AM (PDT)
Event URL: https://www.vanderschaar-lab.com/

In this tutorial, we provide an overview of state-of-the-art techniques for synthesizing the two most common types of clinical data: tabular (or multidimensional) data and time-series data. In particular, we discuss generative modeling approaches based on generative adversarial networks (GANs), normalizing flows, and state-space models for cross-sectional and time-series data. We demonstrate the use of such models in creating synthetic training data for machine learning algorithms and highlight the comparative strengths and weaknesses of the different approaches. In addition, we discuss how to evaluate the quality of synthetic data and the performance of generative models: we highlight the challenges of evaluating generative models as compared to evaluating discriminative predictions, and we present metrics that quantify different aspects of synthetic data quality.

Author Information

Ahmed M. Alaa (UCLA)
Mihaela van der Schaar (University of Cambridge and UCLA)
