Robust Inter-Series Dependency Modeling for Time Series Forecasting via Information-Theoretic Alignment
Abstract
While iTransformer pioneered general inter-variate dependency (IVD) modeling in Transformers for multivariate time series forecasting (MTSF), subsequent research on such universal paradigms has been surprisingly scarce. Through comprehensive analysis, we identify a critical structural inconsistency in Variate Transformers (exemplified by iTransformer): they typically capture inter-variate dependencies only in shallow self-attention layers and neglect deep-layer IVD modeling, which causes loss of dependency information and hampers model optimization. To address these limitations, we propose CGTFra, a general Graph Transformer framework. Specifically, we reconsider existing timestamp-based modeling and introduce a frequency-domain masking and resampling method for periodicity preservation, which serves both as a general input feature enhancement strategy and as a substitute for timestamp embeddings. In addition, CGTFra promotes consistent IVD modeling from two perspectives. First, a dynamic graph learning framework is integrated into the Transformer to explicitly model IVD in deep network layers. Second, grounded in the Information Bottleneck principle, we propose a consistency-constrained alignment to learn more robust IVD and temporal feature representations. These three core designs can be integrated into any existing Variate Transformer-based framework, and CGTFra achieves superior predictive performance across 13 long- and short-term datasets with high computational efficiency and desirable interpretability. Code is available at https://anonymous.4open.science/r/CGTFra.
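To make the frequency-domain masking and resampling idea concrete, the following is a minimal PyTorch sketch of one plausible instantiation: masking out all but the highest-amplitude frequency components of each series and resampling back to the time domain, which preserves the dominant periodicities. The function name, the keep_ratio parameter, and the top-k amplitude criterion are illustrative assumptions, not the exact method of the paper.

```python
import torch

def frequency_mask_resample(x: torch.Tensor, keep_ratio: float = 0.25) -> torch.Tensor:
    """Illustrative frequency-domain masking and resampling.

    x: input window of shape (batch, length, variates).
    keep_ratio: fraction of frequency bins to retain (an assumed knob,
    not taken from the paper).
    """
    # Real FFT along the temporal axis -> (batch, length // 2 + 1, variates)
    spec = torch.fft.rfft(x, dim=1)
    amp = spec.abs()
    k = max(1, int(keep_ratio * amp.shape[1]))
    # Indices of the k dominant frequencies per (batch, variate) pair
    topk = amp.topk(k, dim=1).indices
    mask = torch.zeros_like(amp, dtype=torch.bool)
    mask.scatter_(1, topk, True)
    # Zero out non-dominant components, then resample to the time domain
    spec = torch.where(mask, spec, torch.zeros_like(spec))
    return torch.fft.irfft(spec, n=x.shape[1], dim=1)
```

Under this reading, the filtered series carries the periodic structure that timestamp embeddings would otherwise encode, so it can be concatenated with (or substituted for) such embeddings as an enhanced input feature.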