StarEmbed: Benchmarking Time Series Foundation Models on Astronomical Observations of Variable Stars
Weijian Li ⋅ Hong-Yu Chen ⋅ Nabeel Rehemtulla ⋅ Ved Shah ⋅ Dongho Kim ⋅ Dennis Wu ⋅ Qinjie Lin ⋅ Adam Miller ⋅ Han Liu
Abstract
Time series foundation models (TSFMs) are increasingly adopted as general-purpose time series learners. Although their training corpora are vast, they exclude peta-scale astronomical time series that exhibit unique challenges (e.g., irregular sampling, multiple variates, and heteroskedasticity) and exist in immense quantities. We introduce $\texttt{StarEmbed}$, the first public benchmark for stellar time series observations ("light curves") on three downstream tasks: unsupervised clustering, supervised classification, and out-of-distribution (OOD) source detection. $\texttt{StarEmbed}$ integrates a catalog of expert-vetted light curves, totaling $\sim$40,000 labeled examples across seven astrophysical classes. We evaluate the zero-shot capabilities of three families of TSFMs ($\texttt{Moirai}$, $\texttt{Chronos}$, and $\texttt{Time-MoE}$) and a domain-specific transformer ($\texttt{Astromer}$). Our results demonstrate that the $\texttt{Chronos}$ family, despite being pre-trained on regularly sampled data, outperforms domain-specific baselines and yields state-of-the-art performance in clustering and OOD source detection. While they do not yet strictly surpass hand-crafted features in classification, TSFMs such as $\texttt{Chronos}$ demonstrate excellent generalization, marking a promising step toward universal foundation models in astronomy.