

Poster in Workshop: Next Generation of Sequence Modeling Architectures

SeRpEnt: Selective Resampling for Expressive State Space Models

Stefano Rando · Luca Romani · Matteo Migliarini · Denis Gudovskiy · Luca Franco · Luca Rigazio · Fabio Galasso


Abstract:

State Space Models (SSMs) have risen to prominence in sequence modeling, especially as an alternative to Transformers. The Mamba variant has demonstrated performance comparable to Transformers without any form of attention, thanks to its use of a selective mechanism. Selectivity, however, has so far only been evaluated empirically. In this work, we show that the selective time intervals in Mamba act as linear approximators of information. We then propose our SeRpEnt architecture, an SSM that further exploits selectivity to compress sequences through a resampling mechanism that aggregates elements based on their information content. Empirical results on the Long Range Arena benchmark and a language modeling task show the benefits of SeRpEnt's resampling mechanism.
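The abstract describes compressing a sequence by aggregating elements according to their information content, using scores derived from Mamba's selective time intervals. The sketch below is a hypothetical illustration of that general idea, not the paper's actual mechanism: the function name, the equal-information-mass binning, and the weighted-average aggregation are all assumptions made for the example.

```python
import numpy as np

def selective_resample(x, delta, out_len):
    """Hypothetical sketch: compress a (T, D) sequence `x` into
    `out_len` aggregated elements, guided by per-element information
    scores `delta` (e.g., Mamba-style selective time intervals).
    The real SeRpEnt mechanism is defined in the paper itself."""
    # Normalize scores into a cumulative "information mass" over time.
    mass = np.cumsum(delta / delta.sum())
    # Assign each timestep to one of out_len bins of equal mass, so
    # high-information regions span more output elements.
    bins = np.minimum((mass * out_len).astype(int), out_len - 1)
    # Aggregate each bin with a score-weighted average of its elements.
    out = np.zeros((out_len, x.shape[1]))
    for b in range(out_len):
        idx = bins == b
        if idx.any():
            w = delta[idx] / delta[idx].sum()
            out[b] = (w[:, None] * x[idx]).sum(axis=0)
    return out

# Example: a 12-step sequence compressed to 4 aggregated elements.
rng = np.random.default_rng(0)
x = rng.normal(size=(12, 3))
delta = rng.uniform(0.1, 1.0, size=12)
y = selective_resample(x, delta, 4)
print(y.shape)  # (4, 3)
```

The key design point the abstract implies is that compression is content-dependent: elements carrying more information (larger selective intervals) dominate the aggregated representation, unlike uniform pooling.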
