

Oral in Workshop: Machine Learning for Multimodal Healthcare Data

InterSynth: a semi-synthetic framework for benchmarking prescriptive inference from observational data

Dominic Giles · Robert Gray · Chris Foulon · Guilherme Pombo · Tianbo Xu · James K Ruffle · Rolf Jäger · Jorge Cardoso · Sebastien Ourselin · Geraint Rees · Ashwani Jha · Parashkev Nachev

Keywords: [ Data sparsity, incompleteness and complexity ] [ Medical Imaging ] [ Electronic healthcare records ] [ Benchmarking, domain shifts, and generalization ]


Abstract:

Treatments are prescribed to individuals in pursuit of contemporaneously unobserved outcomes, based on evidence derived from populations with historically observed treatments and outcomes. Since neither treatments nor outcomes are typically replicable in the same individual, alternatives remain counterfactual in both settings. Prescriptive fidelity therefore cannot be evaluated empirically at the individual level, forcing reliance on lossy, group-level estimates, such as average treatment effects, that presume an implausibly low ceiling on individuation. The lack of empirical ground truths critically impedes the development of individualised prescriptive models, on which realising personalised care inevitably depends. Here we present InterSynth, a general platform for modelling biologically plausible, empirically informed, semi-synthetic ground truths, for the evaluation of prescriptive models operating at the individual level. InterSynth permits comprehensive simulation of heterogeneous treatment effect sizes and variability, and observed and unobserved confounding treatment allocation biases, with explicit modelling of decoupled response failure and spontaneous recovery. Operable with high-dimensional data such as high-resolution brain lesion maps, InterSynth offers a principled means of quantifying the fidelity of prescriptive models across a wide range of plausible real-world conditions. We demonstrate end-to-end use of the platform with an example employing real neuroimaging data from patients with ischaemic stroke, volume image-based succinct lesion representations, and semi-synthetic ground truths informed by functional, transcriptomic and receptomic data. We make our platform freely available to the scientific community.
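
The simulation ingredients named in the abstract (treatment effect, confounded allocation, response failure, spontaneous recovery) can be illustrated with a minimal sketch. This is not InterSynth's API: the function name, its parameters, the two-treatment setup, and the randomly drawn latent subtype are all assumptions made for illustration. In the platform itself, susceptibility would be informed by lesion maps and functional, transcriptomic and receptomic data rather than drawn at random.

```python
import numpy as np

def simulate_ground_truth(n, treatment_effect, spontaneous_recovery,
                          allocation_bias, seed=0):
    """Hypothetical sketch of a semi-synthetic ground truth (not InterSynth's API).

    treatment_effect:     P(response | individual receives the treatment they are susceptible to)
    spontaneous_recovery: P(recovery regardless of treatment)
    allocation_bias:      in [-1, 1]; 0 is random allocation, 1 always assigns
                          the treatment the individual is susceptible to
    """
    rng = np.random.default_rng(seed)

    # Assumed latent subtype: which of two treatments (0 or 1) each individual
    # is susceptible to. Drawn at random here purely for illustration.
    susceptible_to = rng.integers(0, 2, size=n)

    # Confounded allocation: the chance of receiving the matching treatment
    # depends on the latent subtype through allocation_bias.
    p_match = 0.5 + 0.5 * allocation_bias
    matched = rng.random(n) < p_match
    assigned = np.where(matched, susceptible_to, 1 - susceptible_to)

    # Response failure: even a matching treatment works only with
    # probability treatment_effect.
    responds = matched & (rng.random(n) < treatment_effect)

    # Spontaneous recovery is drawn independently of treatment response,
    # so the two mechanisms are decoupled.
    recovers_anyway = rng.random(n) < spontaneous_recovery

    outcome = responds | recovers_anyway
    return {"susceptible_to": susceptible_to, "assigned": assigned,
            "outcome": outcome}

data = simulate_ground_truth(n=1000, treatment_effect=0.5,
                             spontaneous_recovery=0.1, allocation_bias=0.3)
print(data["outcome"].mean())
```

Because the susceptible treatment is known for every simulated individual, a prescriptive model's recommendation can be scored against it directly, which is not possible with purely observational outcomes.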
