Poster in Workshop: Next Generation of Sequence Modeling Architectures

Unlocking the Secrets of Linear Complexity Sequence Model from A Unified Perspective

Zhen Qin ⋅ Xuyang Shen ⋅ Dong Li ⋅ Weigao Sun ⋅ Stan Birchfield ⋅ Richard I Hartley ⋅ Yiran Zhong


Abstract

We present the Linear Complexity Sequence Model (LCSM), a single framework that unifies sequence modeling techniques with linear complexity, including linear attention, state space models, long convolution, and linear RNNs. The goal is to improve understanding of these models by analyzing the impact of each component from one cohesive, streamlined viewpoint. Specifically, we divide the modeling process of these models into three distinct stages: Expand, Oscillation, and Shrink (EOS), with each model having its own specific settings for each stage. The Expand stage projects the input signal onto a high-dimensional memory state. The Oscillation stage then performs recursive operations on the memory state. Finally, the Shrink stage projects the memory state back to a low-dimensional space. We perform comprehensive experiments to analyze the impact of different stage settings on language modeling and retrieval tasks. Our results show that data-driven methods are crucial for the effectiveness of the three stages in language modeling.
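The three stages can be illustrated with a small NumPy sketch. This is only one plausible instantiation, not the authors' implementation: the abstract does not give the update rule, so the outer-product Expand, the elementwise-decay Oscillation, and the dot-product Shrink used here are assumptions, as are the function name `eos_scan` and the parameter names `expand`, `decay`, and `shrink`.

```python
import numpy as np


def eos_scan(inputs, expand, decay, shrink):
    """Hypothetical Expand-Oscillation-Shrink recurrence (illustrative only).

    inputs: (n, d) input sequence
    expand: (n, k) per-step expand vectors (assumed form)
    decay:  (n, k) per-step decay applied to the memory rows (assumed form)
    shrink: (n, k) per-step shrink vectors (assumed form)
    Returns an (n, d) output sequence.
    """
    n, d = inputs.shape
    k = expand.shape[1]
    memory = np.zeros((k, d))  # high-dimensional memory state
    outputs = np.zeros((n, d))
    for t in range(n):
        # Expand: lift the d-dim input to a (k, d) memory update.
        update = np.outer(expand[t], inputs[t])
        # Oscillation: recursive update of the memory state.
        memory = decay[t][:, None] * memory + update
        # Shrink: project the memory back to d dimensions.
        outputs[t] = shrink[t] @ memory
    return outputs
```

Each step costs O(k·d) regardless of position, so the whole scan is linear in the sequence length n. Under these assumptions, setting `decay` to all ones reduces the recurrence to causal linear attention without normalization, with `expand` playing the role of keys and `shrink` the role of queries. Making `expand`, `decay`, and `shrink` functions of the input rather than fixed parameters would correspond to the data-driven settings the abstract refers to.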
