Information Geometry Loss for Time Series Forecasting
Abstract
Time series forecasting fundamentally involves learning probability distributions over future observations. However, existing loss functions rely on point-wise Euclidean metrics, neglecting the intrinsic geometric structure of probability distributions. This leads to suboptimal alignment between predicted and true distributions, particularly for uncertainty quantification. We propose InfoGeo Loss, a principled loss function grounded in information geometry that measures distributional discrepancies on statistical manifolds. Our approach comprises three key components: (1) a distribution parameterization module that models predictions with learnable sufficient statistics, (2) a Fisher information metric that quantifies intrinsic distributional distance, and (3) a Bregman divergence component that captures asymmetric prediction errors. We further introduce a natural gradient weighting strategy for efficient optimization on statistical manifolds. Theoretically, we prove statistical consistency and establish convergence guarantees. Extensive experiments across seven datasets and five architectures show that InfoGeo Loss consistently outperforms existing loss functions, achieving average improvements of 6.8% in MSE and 5.3% in MAE.
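To make the abstract's ingredients concrete, the following is a minimal illustrative sketch (not the paper's actual implementation) of how a Fisher-metric term and a Bregman-divergence term could be combined for a univariate Gaussian predictive distribution. All function names (`gaussian_kl`, `fisher_quadratic`, `infogeo_like_loss`) and the mixing weight `alpha` are hypothetical; the sketch relies on two standard facts: KL divergence between members of an exponential family is a Bregman divergence on the natural parameters, and the Fisher information of N(mu, sigma^2) is diag(1/sigma^2, 2/sigma^2) in the (mu, sigma) parameterization.

```python
import math

def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    # KL(N(mu_p, sigma_p^2) || N(mu_q, sigma_q^2)): closed form for Gaussians.
    # For exponential families, KL is a Bregman divergence on the natural
    # parameters, so this term is inherently asymmetric in p and q.
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
            - 0.5)

def fisher_quadratic(mu_p, sigma_p, mu_q, sigma_q):
    # Local squared distance under the Fisher metric at (mu_p, sigma_p):
    # (1/2) * dtheta^T F dtheta with F = diag(1/sigma^2, 2/sigma^2).
    d_mu = mu_q - mu_p
    d_sigma = sigma_q - sigma_p
    return 0.5 * (d_mu ** 2 / sigma_p ** 2 + 2.0 * d_sigma ** 2 / sigma_p ** 2)

def infogeo_like_loss(mu_pred, sigma_pred, mu_true, sigma_true, alpha=0.5):
    # Hypothetical combination: a symmetric-in-spirit intrinsic-distance term
    # plus an asymmetric Bregman (KL) term, weighted by alpha.
    return (alpha * fisher_quadratic(mu_pred, sigma_pred, mu_true, sigma_true)
            + (1.0 - alpha) * gaussian_kl(mu_pred, sigma_pred, mu_true, sigma_true))
```

In practice the target distribution's parameters would come from the paper's distribution parameterization module rather than being given directly; here they are passed in explicitly to keep the sketch self-contained.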