Spotlight in Workshop: The Synergy of Scientific and Machine Learning Modelling (SynS & ML) Workshop
ClimaX: A Foundation Model for Weather and Climate
Tung Nguyen
Recent data-driven approaches based on machine learning aim to directly solve a downstream forecasting or projection task by learning a data-driven functional mapping using deep neural networks. However, these networks are trained using curated and homogeneous climate datasets for specific spatiotemporal tasks, and thus lack the generality of currently used physics-informed numerical models for weather and climate modeling. We develop and demonstrate ClimaX, a flexible and generalizable deep learning model for weather and climate science that can be trained using heterogeneous datasets spanning different variables, spatiotemporal coverage, and physical groundings. ClimaX extends the Transformer architecture with novel encoding and aggregation blocks that allow effective use of available compute and data while maintaining general utility. ClimaX is pretrained with a self-supervised learning objective on climate datasets derived from CMIP6. The pretrained ClimaX can then be fine-tuned to address a breadth of climate and weather tasks, including those that involve atmospheric variables and spatiotemporal scales unseen during pretraining. Compared to existing data-driven baselines, we show that this generality in ClimaX results in superior performance on benchmarks for weather forecasting and climate projections. Our source code is available at https://github.com/microsoft/ClimaX.