

Poster in Workshop: Workshop on Theoretical Foundations of Foundation Models (TF2M)

Transformer Designs for In-Context Learning in Foundation Models for Time Series Forecasting with Covariates

Afrin Dange · Raj · Praneeth Kumar Netrapalli · Sunita Sarawagi


Abstract:

Recent foundation models (FMs) for time series forecasting (TSF) have shown promising zero-shot generalization to new series. However, when time series are associated with input covariates, these models cannot capture the series-specific dependence of the forecasted values on those covariates. We identify that historical values in TSF implicitly provide labeled data, which can be leveraged for in-context learning (ICL). While transformers have demonstrated ICL capabilities for regression tasks, harnessing them as FMs requires analyzing the impact of what constitutes a token in the transformer, the type of attention, and the placement of loss functions during pre-training. We study three existing tokenization schemes for regression tasks in terms of their training convergence and ICL capacity. We propose a modified shifted causal attention designed for faster convergence during pre-training, since it allows the next-token loss to be imposed at multiple positions. Further, it combines the covariates and target such that ICL for linear regression is achievable in just one layer. For time-series data, a popular tokenization method in existing FMs is patching the input series. Our theoretical analysis shows that such tokenization is suboptimal for ICL on time series with covariates.
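The following is a minimal NumPy sketch of one plausible reading of the token construction described above: each token packs the current covariate with the shifted (previous) target, so that under a causal mask a next-token loss on the target can be imposed at every position. The concatenation scheme, the zero padding for the first target, and the names `tokens` and `causal_self_attention` are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

rng = np.random.default_rng(0)

T, d_x = 8, 3                                  # context length, covariate dimension
w_true = rng.normal(size=d_x)
x = rng.normal(size=(T, d_x))                  # covariates x_1..x_T
y = x @ w_true + 0.01 * rng.normal(size=T)     # targets y_1..y_T (linear task)

# Shifted tokens: z_t = [x_t, y_{t-1}], with y_0 := 0 (assumed padding).
y_prev = np.concatenate([[0.0], y[:-1]])
tokens = np.concatenate([x, y_prev[:, None]], axis=1)   # shape (T, d_x + 1)

def causal_self_attention(z, d_model=16, seed=1):
    """Single-head self-attention with a lower-triangular (causal) mask."""
    rng = np.random.default_rng(seed)
    d_in = z.shape[1]
    Wq, Wk, Wv = (rng.normal(scale=d_in**-0.5, size=(d_in, d_model)) for _ in range(3))
    q, k, v = z @ Wq, z @ Wk, z @ Wv
    scores = q @ k.T / np.sqrt(d_model)
    mask = np.tril(np.ones((len(z), len(z)), dtype=bool))
    scores = np.where(mask, scores, -np.inf)
    attn = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)
    return attn @ v

h = causal_self_attention(tokens)   # (T, d_model) hidden states

# Because position t already attends over (x_t, y_{t-1}), ..., (x_1, y_0), a readout
# head can be trained to predict y_t at every position, yielding T loss terms per
# series rather than a single loss at the end of the context.
print("hidden states:", h.shape, "-> one prediction target y_t per position")
```

Under this construction, the in-context examples needed for ICL on a linear regression task are already present in the token stream itself, which is the intuition behind imposing the next-token loss at multiple positions.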
