Spotlight in Workshop: AI for Science: Scaling in AI for Scientific Discovery
Deep learning virtual screening with active signature learning improves the identification of small-molecule modulators of complex phenotypes
Daniel Burkhardt · Inna Lipchina · Samuel Miller · Benjamin Demeo · Doris Fu · Peter Holderreith · David Kim · Ishan Gupta · Charlotte Nesbitt · Thomas Pfefer · Raziel Rojas-Rodriguez · Zamanighomi · Mauricio Cortes · Alex Shalek · Fabian Theis
Keywords: [ Drug discovery ] [ Computational Biology ] [ transcriptomics ] [ genomics ] [ single-cell ]
Phenotypic drug discovery holds promise for developing new medicines but is limited by throughput and scalability. Current applications of AI to improve screening efficiency rely on single-use models, each trained on a phenotype-specific high-throughput screen. We introduce a generalizable deep learning framework that leverages omics data to prioritize compounds for virtually any phenotype using a single model. We also developed a novel closed-loop active signature learning procedure to optimize the omics signature associated with a target phenotype. We trained our model on over 425,000 perturbation signatures and validated it using a new single-cell transcriptomics benchmark dataset profiling 88 perturbations across 10 cell lines. Our approach outperformed published methods by 15-80% and led to a 16-19x increase in productivity in two hematology phenotypic discovery campaigns, providing the first experimental validation that deep learning and omics data can improve the productivity of phenotypic discovery in a real-world setting. We next demonstrated that our active signature learning algorithm can refine hit-compound prioritization and yield mechanistic insights through an integrative lab-in-the-loop framework. This approach enables rational drug design targeting complex phenotypes, ushering in a new era of drug discovery.
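To make the idea of prioritizing compounds against an omics signature concrete, the following is a minimal sketch of signature matching. It is not the authors' model: the abstract does not describe the scoring function, so cosine similarity is assumed here as a simple stand-in, and the function name `rank_compounds`, the toy signatures, and the compound labels are all hypothetical.

```python
import numpy as np

def rank_compounds(target_signature, compound_signatures, compound_names):
    """Rank compounds by cosine similarity to a target phenotype signature.

    Assumption: each signature is a vector of per-gene expression changes,
    and a higher cosine similarity means a better match to the phenotype.
    """
    target = np.asarray(target_signature, dtype=float)
    sigs = np.asarray(compound_signatures, dtype=float)

    # Product of vector norms; replace zeros with 1 to avoid division by zero
    # (an all-zero signature then receives a score of 0).
    norms = np.linalg.norm(sigs, axis=1) * np.linalg.norm(target)
    norms = np.where(norms == 0, 1.0, norms)

    scores = (sigs @ target) / norms

    # Sort from most similar to least similar.
    order = np.argsort(-scores, kind="stable")
    return [(compound_names[i], float(scores[i])) for i in order]

# Toy example: three genes, three hypothetical compounds.
target = [1.0, -1.0, 0.0]
signatures = [
    [1.0, -1.0, 0.0],   # matches the target signature
    [-1.0, 1.0, 0.0],   # reverses the target signature
    [0.0, 0.0, 1.0],    # unrelated to the target signature
]
ranking = rank_compounds(target, signatures, ["cpd_A", "cpd_B", "cpd_C"])
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

In a closed-loop setting of the kind the abstract describes, the target signature itself would be revised between rounds using experimental readouts; this sketch covers only a single ranking step.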