A Regime-Aware Trajectory Prediction Framework for 1000+ Systems Biology Models
Abstract
Predicting long-horizon trajectories of biological dynamical systems remains challenging because these systems are highly heterogeneous. Most existing machine learning approaches are system-specific: they require retraining for each new system and generalize poorly across distinct biological regimes. To address this limitation, we construct a large-scale benchmark of over 1,000 ODE-based systems biology models spanning diverse organisms, biological processes, and dynamical behaviors. Building on this benchmark, we propose a regime-aware trajectory prediction framework that enables cross-system generalization and uncertainty quantification for unseen systems. Our approach replaces the standard Gaussian source distribution in conditional flow matching with structured initial states derived from biological regime priors, such as growth trends and oscillatory rhythms. We provide theoretical justification for this initialization and empirically demonstrate state-of-the-art accuracy (31\% MAE reduction), well-calibrated uncertainty (17\% CRPS improvement), and efficient long-horizon inference across the benchmark.
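To make the central idea concrete, the following is a minimal sketch of how a structured source sample could stand in for Gaussian noise when building a conditional flow matching training pair. It is an illustration under stated assumptions, not the framework's actual implementation: the function names `regime_prior_source` and `cfm_training_pair`, the logistic and sinusoidal prior shapes, the parameter ranges, and the use of the common straight-line interpolation path are all hypothetical choices made here for exposition.

```python
import numpy as np


def regime_prior_source(t, regime, rng, noise_scale=0.1):
    """Draw a structured source trajectory x0 from a hypothetical regime prior.

    The prior shapes below are illustrative assumptions, not the paper's priors.
    """
    if regime == "growth":
        # Logistic-like growth trend with a randomly drawn rate.
        rate = rng.uniform(0.5, 2.0)
        base = 1.0 / (1.0 + np.exp(-rate * (t - t.mean())))
    elif regime == "oscillatory":
        # Sinusoid with randomly drawn frequency and phase.
        freq = rng.uniform(0.5, 2.0)
        phase = rng.uniform(0.0, 2.0 * np.pi)
        base = np.sin(2.0 * np.pi * freq * t + phase)
    else:
        raise ValueError(f"unknown regime: {regime}")
    # Small Gaussian perturbation keeps the source a distribution, not a point.
    return base + noise_scale * rng.standard_normal(t.shape)


def cfm_training_pair(x0, x1, tau):
    """Straight-line path between source x0 and target x1 at flow time tau.

    Returns the interpolated state and the velocity a model would regress onto.
    """
    x_tau = (1.0 - tau) * x0 + tau * x1
    u_target = x1 - x0
    return x_tau, u_target


rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 50)            # physical time grid of the trajectory
x1 = np.sin(2.0 * np.pi * t)             # stand-in for a ground-truth trajectory
x0 = regime_prior_source(t, "oscillatory", rng)
tau = rng.uniform()                      # flow time in [0, 1), distinct from t
x_tau, u_target = cfm_training_pair(x0, x1, tau)
```

In this sketch the source sample already carries the coarse shape of the target regime, so the velocity field only has to transport it over a short distance; with a standard Gaussian source, `x0` would instead be `rng.standard_normal(t.shape)`.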