Poster in Workshop: Next Generation of Sequence Modeling Architectures
Length independent generalization bounds for deep SSM architectures
Dániel Rácz · Mihaly Petreczky · Balint Daroczy
Abstract:
Many state-of-the-art models trained on long-range sequences, for example S4, S5, or LRU, are built from sequential blocks that combine State-Space Models (SSMs) with neural networks. In this paper we provide a PAC bound for this kind of architecture with stable SSM blocks; the bound does not depend on the length of the input sequence. Imposing stability on the SSM blocks is standard practice in the literature and is known to help performance. Our results provide a theoretical justification for the use of stable SSM blocks, as the proposed PAC bound decreases as the degree of stability of the SSM blocks increases.
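For concreteness, below is a minimal sketch (not from the paper) of the kind of block the abstract describes: a discrete-time linear state-space recurrence whose diagonal state matrix is constrained to have spectral radius below 1, followed by a pointwise neural-network stage, loosely in the style of S4/S5/LRU. All names, shapes, and the `max_radius` parameterization are illustrative assumptions, not the authors' construction.

```python
import numpy as np

def make_stable_diagonal_ssm(state_dim, rng, min_radius=0.0, max_radius=0.99):
    # Diagonal state matrix with eigenvalue moduli in [min_radius, max_radius] < 1,
    # which enforces stability; a smaller max_radius means a "more stable" block.
    radii = rng.uniform(min_radius, max_radius, size=state_dim)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=state_dim)
    return radii * np.exp(1j * phases)  # diagonal entries of the state matrix

def ssm_block(u, A, B, C, W):
    # u: (T, d_in) input sequence. Runs x_{t+1} = A x_t + B u_t, y_t = Re(C x_t),
    # then applies a pointwise nonlinearity (a single tanh layer here).
    x = np.zeros(A.shape[0], dtype=complex)
    ys = []
    for t in range(u.shape[0]):
        x = A * x + B @ u[t]          # elementwise product since A is diagonal
        ys.append(np.real(C @ x))
    return np.tanh(np.stack(ys) @ W)  # pointwise neural-network stage

rng = np.random.default_rng(0)
d_in, d_state, d_out, T = 4, 16, 4, 1000
A = make_stable_diagonal_ssm(d_state, rng)
B = (rng.standard_normal((d_state, d_in))
     + 1j * rng.standard_normal((d_state, d_in))) / np.sqrt(d_in)
C = (rng.standard_normal((d_out, d_state))
     + 1j * rng.standard_normal((d_out, d_state))) / np.sqrt(d_state)
W = rng.standard_normal((d_out, d_out)) / np.sqrt(d_out)
y = ssm_block(rng.standard_normal((T, d_in)), A, B, C, W)  # shape (T, d_out)
```

The `max_radius` cap is where the stability assumption enters: tightening it toward 0 corresponds to a higher degree of stability, which, per the abstract, yields a smaller PAC bound regardless of the sequence length T.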