Poster in Workshop: AI for Science: Scaling in AI for Scientific Discovery
TarDis: Achieving Robust and Structured Disentanglement of Multiple Covariates
Kemal Inecik · Aleyna Kara · Antony Rose · Muzlifah Haniffa · Fabian Theis
Keywords: [ Generative Models ] [ self-supervised learning ] [ covariate disentanglement ] [ single-cell genomics ] [ invariant representation learning ] [ latent space optimization ]
Addressing challenges in domain invariance within single-cell genomics necessitates innovative strategies to manage the heterogeneity of multi-source datasets while maintaining the integrity of biological signals. We introduce TarDis, a novel deep generative model designed to disentangle intricate covariate structures across diverse biological datasets, distinguishing technical artifacts from true biological variations. By employing tailored covariate-specific loss components and a self-supervised approach, TarDis effectively generates multiple latent space representations that capture each continuous and categorical target covariate separately, along with unexplained variation. Our extensive evaluations demonstrate that TarDis outperforms existing methods in data integration, covariate disentanglement, and robust out-of-distribution predictions. The model's capacity to produce interpretable and structured latent spaces, including ordered latent representations for continuous covariates, enhances its utility in hypothesis-driven research. Consequently, TarDis offers a promising analytical platform for advancing scientific discovery, providing insights into cellular dynamics, and enabling targeted therapeutic interventions.
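To make the idea of separate latent representations concrete, the following is a minimal sketch of how a single latent vector could be partitioned into one block per target covariate plus a remaining block for unexplained variation. It is an illustration only, not the TarDis implementation: the function name `split_latent`, the covariate names `age` and `batch`, the block widths, and the contiguous-slice layout are all assumptions made for this example, and the covariate-specific loss terms mentioned in the abstract are not modeled here.

```python
from typing import Dict, List, Sequence


def split_latent(
    z: Sequence[float],
    reserved: Dict[str, int],
    rest_name: str = "unexplained",
) -> Dict[str, List[float]]:
    """Partition a latent vector into named, non-overlapping blocks.

    `reserved` maps a (hypothetical) covariate name to the number of latent
    dimensions set aside for it. Whatever is left over is returned under
    `rest_name`, standing in for variation not tied to any covariate.
    """
    if rest_name in reserved:
        raise ValueError(f"'{rest_name}' clashes with a covariate name")
    if any(width < 0 for width in reserved.values()):
        raise ValueError("block widths must be non-negative")
    if sum(reserved.values()) > len(z):
        raise ValueError("reserved dimensions exceed the latent size")

    blocks: Dict[str, List[float]] = {}
    start = 0
    # Assign consecutive slices in the order the covariates were given.
    for name, width in reserved.items():
        blocks[name] = list(z[start:start + width])
        start += width
    # Remaining dimensions are not reserved for any covariate.
    blocks[rest_name] = list(z[start:])
    return blocks


# Illustrative values only: a 6-dimensional latent vector with one dimension
# assumed for a continuous covariate and two for a categorical one.
z = [0.1, -0.4, 1.2, 0.7, -0.9, 0.3]
blocks = split_latent(z, {"age": 1, "batch": 2})
```

In a layout like this, downstream analyses could read the covariate-specific blocks and the remaining block independently, which is the kind of structured latent space the abstract describes.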