

Poster

From generalization analysis to optimization designs for state space models

Fusheng Liu · Qianxiao Li


Abstract:

State space models (SSMs) are foundational models in time series analysis and have recently been shown to be an alternative to transformers in sequence modeling. In this paper, we theoretically study the generalization of SSMs and propose improvements to training algorithms based on the generalization results. Specifically, we give a data-dependent generalization bound for SSMs, showing an interplay between the SSM parameters and the temporal dependencies of the training sequences. Leveraging the generalization bound, we (1) set up a scaling rule for model initialization based on the proposed generalization measure, which significantly improves the robustness of the output value scales of SSMs to different temporal patterns in the sequence data; and (2) introduce a new regularization method for training SSMs to enhance the generalization performance. Numerical experiments are conducted to validate our results.
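
To make the idea of data-dependent initialization scaling concrete, here is a minimal sketch in plain Python. It is not the scaling rule from the paper: the abstract does not state the generalization measure, so the root-mean-square output over the training sequences is used here as a stand-in, and that choice is an assumption. The names `ssm_outputs` and `rescale_readout` and the diagonal parameterization (`a`, `b`, `c`) are likewise hypothetical and chosen only for illustration.

```python
import math


def ssm_outputs(a, b, c, xs):
    """Run a diagonal linear SSM on a scalar input sequence.

    State update: h_i <- a_i * h_i + b_i * x   (one scalar per state dimension)
    Readout:      y   =  sum_i c_i * h_i
    """
    h = [0.0] * len(a)
    ys = []
    for x in xs:
        h = [ai * hi + bi * x for ai, hi, bi in zip(a, h, b)]
        ys.append(sum(ci * hi for ci, hi in zip(c, h)))
    return ys


def rescale_readout(a, b, c, sequences):
    """Rescale the readout weights c so the RMS output over `sequences` is 1.

    ASSUMPTION: RMS output is a stand-in for the paper's generalization
    measure, which the abstract does not specify.
    """
    total_sq, count = 0.0, 0
    for xs in sequences:
        for y in ssm_outputs(a, b, c, xs):
            total_sq += y * y
            count += 1
    if count == 0:
        return list(c)  # no data to measure against: leave c unchanged
    rms = math.sqrt(total_sq / count)
    if rms == 0.0:
        return list(c)  # outputs are all zero: nothing to rescale
    return [ci / rms for ci in c]
```

Because the output is linear in `c`, dividing `c` by the measured scale normalizes the output magnitude on the given data regardless of how slowly or quickly the inputs vary, which is the kind of robustness to temporal patterns the abstract refers to.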
