

Poster

Multimodal Prototyping for Cancer Survival Prediction

Andrew Song · Richard Chen · Guillaume Jaume · Anurag Vaidya · Alexander Baras · Faisal Mahmood


Abstract: Multimodal survival methods combining histology whole-slide images (WSIs) and transcriptomic profiles are particularly promising for patient prognostication and stratification. Current approaches tokenize WSIs into histology patches ($>10^4$ tokens) and transcriptomics into gene groups, which are then integrated using a Transformer to predict outcomes. However, this process generates many tokens, leading to high memory requirements for computing attention and complicating post-hoc interpretability analyses. Instead, we hypothesize that we can: (1) effectively summarize the morphological content of a WSI by condensing its constituent tokens into morphological prototypes with a Gaussian mixture model, achieving more than $300\times$ compression; and (2) accurately characterize cellular functions by encoding the transcriptomic profile with biological pathway prototypes, all in an unsupervised fashion. The resulting multimodal tokens are then processed by a fusion network, either a Transformer or an optimal-transport cross-alignment, which now operates on a small, fixed number of tokens without approximations; this was not possible in previous fusion frameworks for cancer survival prediction. Evaluation on six cancer types shows that our framework significantly outperforms state-of-the-art methods with far less computation, while unlocking new interpretability analyses.
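To make the prototype-compression idea concrete, here is a minimal, hedged sketch of the summarization step: many patch embeddings are softly assigned to a small set of prototypes and aggregated into one token per prototype. All data and names here are hypothetical; the paper fits a Gaussian mixture model to the patch embeddings, whereas this toy uses fixed prototypes and softmax responsibilities with identity covariance to illustrate the token-count reduction only.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def compress_tokens(patches, prototypes):
    """Map N patch embeddings to C prototype tokens (N >> C).

    Each patch is softly assigned to prototypes via a softmax over
    negative squared distances (a stand-in for GMM responsibilities
    with identity covariance); each output token is the
    responsibility-weighted mean of the patches assigned to it.
    """
    C, D = len(prototypes), len(prototypes[0])
    tokens = [[0.0] * D for _ in range(C)]
    weights = [0.0] * C
    for p in patches:
        resp = softmax([-sum((a - b) ** 2 for a, b in zip(p, proto))
                        for proto in prototypes])
        for c in range(C):
            weights[c] += resp[c]
            for d in range(D):
                tokens[c][d] += resp[c] * p[d]
    # Normalize each prototype token by its total assignment mass.
    return [[t / max(w, 1e-12) for t in tok]
            for tok, w in zip(tokens, weights)]

# Toy example: six 2-D patch embeddings compressed to two tokens.
patches = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
           [1.0, 0.9], [0.9, 1.0], [0.95, 0.95]]
prototypes = [[0.0, 0.0], [1.0, 1.0]]
tokens = compress_tokens(patches, prototypes)
```

Whatever the number of input patches, the fusion network downstream then sees only `len(prototypes)` tokens per modality, which is what makes exact (non-approximated) attention or optimal-transport alignment tractable.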
