Needles in the Haystack: Addressing Signal Dilution Improves scRNA-seq Perturbation Response Modeling and Evaluation
Gabriel Mejia ⋅ Henry Miller ⋅ Francis Leblanc ⋅ BO WANG ⋅ Brendan Swain ⋅ Lucas Paulo de Lima Camillo
Abstract
Recent benchmarks reveal that single-cell perturbation response models are often outperformed by simply predicting the dataset mean. Through large-scale *in silico* simulations, together with analyses of two real-world perturbation datasets, we trace this anomaly to a metric artifact: unweighted error metrics systematically reward mean predictions when perturbation effects are sparse. To address this limitation, we introduce differentially expressed gene (DEG)-aware metrics—weighted mean-squared error (WMSE) and weighted delta $R^{2}$ ($R^{2}_{w}(\Delta)$)—that sensitively measure error in niche, perturbation-specific signals. We further propose explicit negative and positive performance baselines to calibrate these metrics. Under this framework, the mean baseline sinks to null performance, while genuinely informative predictors are correctly rewarded. Finally, we show that using WMSE as a training objective reduces mode collapse and improves predictive performance across multiple model architectures.
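The metric artifact described above can be illustrated with a toy calculation. The sketch below is not the paper's WMSE definition, which the abstract does not spell out. It assumes a hypothetical weighting scheme (weight 1.0 on DEGs, 0.01 on all other genes) and hand-made expression values, chosen only to show how an unweighted MSE can favor the dataset mean when few genes respond, while a DEG-weighted MSE reverses the ranking.

```python
# Toy illustration only: values and weights are assumptions, not the paper's.

def mse(pred, true):
    """Unweighted mean-squared error over genes."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)

def weighted_mse(pred, true, weights):
    """Mean-squared error with per-gene weights, normalized by the weight sum."""
    total = sum(weights)
    return sum(w * (p - t) ** 2 for p, t, w in zip(pred, true, weights)) / total

n_genes, n_degs = 100, 5  # sparse effect: 5 of 100 genes respond

# True post-perturbation profile: DEGs shift to 3.0, the rest stay at the mean 1.0.
true = [3.0 if g < n_degs else 1.0 for g in range(n_genes)]

# Baseline: predict the dataset mean for every gene (misses all DEGs).
mean_pred = [1.0] * n_genes

# Hypothetical informative model: exact on DEGs, off by 0.5 on every other gene.
informative_pred = [
    3.0 if g < n_degs else 1.0 + (0.5 if g % 2 else -0.5)
    for g in range(n_genes)
]

# Assumed DEG-aware weights: most of the weight sits on the DEGs.
weights = [1.0 if g < n_degs else 0.01 for g in range(n_genes)]

print("MSE   mean baseline:", mse(mean_pred, true))
print("MSE   informative  :", mse(informative_pred, true))
print("WMSE  mean baseline:", weighted_mse(mean_pred, true, weights))
print("WMSE  informative  :", weighted_mse(informative_pred, true, weights))
```

In this toy setting the unweighted MSE ranks the mean baseline ahead of the informative predictor (about 0.20 versus 0.24), whereas the weighted version ranks the informative predictor far ahead (about 0.04 versus 3.4). How strongly the ranking changes depends entirely on the assumed weights.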