Common Task Framework For a Critical Evaluation of Scientific Machine Learning Algorithms
Abstract
Rapid progress in scientific machine learning has outpaced the development of standardized, objective benchmarks. To address this gap, we propose a Common Task Framework (CTF) for scientific machine learning. The CTF features a curated set of datasets and task-specific metrics spanning forecasting, state reconstruction, and generalization under realistic constraints, including noise and limited data. Inspired by the success of CTFs in fields such as natural language processing and computer vision, our framework provides a structured, rigorous foundation for head-to-head evaluation of diverse algorithms. The framework is open source, enabling researchers to rapidly implement, test, and optimize their models against our datasets, in support of our long-term goal of raising the bar for rigor and reproducibility in scientific ML.