Poster in Workshop: Workshop on Theoretical Foundations of Foundation Models (TF2M)

MSAMamba: Adapting Subquadratic Models To Long-Context DNA MSA Analysis

Vishrut Thoutam · Dina Ellsworth


Abstract

We introduce MSAMamba, a novel architecture designed to address the context-length limitation of existing transformer-based models for DNA multiple sequence alignments (MSAs). Traditional transformers struggle with the vast context lengths inherent in MSA genome data, mainly because the cost of self-attention grows quadratically with sequence length, which becomes prohibitive at large batch sizes. MSAMamba applies a selective scan operation along the sequence dimension and processes the sequence-length and MSA dimensions separately to improve efficiency. This architecture enables scalable analysis of long DNA sequences, increasing the training context length of previous methods by 8x. In addition, we develop a row-sparse training method that significantly reduces the computational overhead of the selective scan. We demonstrate that MSAMamba performs on par with state-of-the-art (SOTA) transformer-based models on variant effect prediction tasks and exceeds their performance at longer context lengths. We also demonstrate that MSAMamba excels on long-context GenomicBenchmarks tasks. Our results indicate that MSAMamba mitigates the computational challenges of long-context DNA MSA analysis and sets a new standard for scalability and efficiency in genomic modeling.
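
To make the central idea concrete, the following is a minimal sketch of a selective scan applied along the sequence dimension of an MSA tensor. It is not the MSAMamba implementation: the function names, the tensor layout `(rows, length, dim)`, the projection matrices `w_delta`, `w_b`, `w_c`, and the mean-based row mixing are all assumptions chosen for illustration, following the general Mamba-style recurrence. The row-sparse training method is not modeled here. The point is only that each alignment row is scanned independently in time linear in sequence length, and that mixing across the MSA dimension can be a separate step.

```python
import numpy as np


def selective_scan(x, w_delta, w_b, w_c, a):
    """Toy selective scan over the sequence axis (illustrative, not the paper's code).

    x:        (rows, length, dim) embedded MSA; each row is scanned independently.
    w_delta:  (dim, dim)   hypothetical projection producing the step size.
    w_b, w_c: (dim, state) hypothetical input/output projections.
    a:        (dim, state) negative decay rates of the state.
    """
    rows, length, dim = x.shape
    state = a.shape[1]
    h = np.zeros((rows, dim, state))          # recurrent state per row and channel
    y = np.empty_like(x)
    for t in range(length):                   # one pass: cost is linear in length
        xt = x[:, t, :]                                   # (rows, dim)
        delta = np.log1p(np.exp(xt @ w_delta))            # softplus step size
        b = xt @ w_b                                      # (rows, state)
        c = xt @ w_c                                      # (rows, state)
        decay = np.exp(delta[:, :, None] * a[None])       # input-dependent decay
        h = decay * h + (delta * xt)[:, :, None] * b[:, None, :]
        y[:, t, :] = np.einsum("rds,rs->rd", h, c)        # read out the state
    return y


def mix_rows(y):
    """Assumed stand-in for MSA-dimension processing: add the per-column mean."""
    return y + y.mean(axis=0, keepdims=True)


rng = np.random.default_rng(0)
rows, length, dim, state = 3, 16, 4, 2
x = rng.normal(size=(rows, length, dim))
w_delta = 0.1 * rng.normal(size=(dim, dim))
w_b = 0.1 * rng.normal(size=(dim, state))
w_c = 0.1 * rng.normal(size=(dim, state))
a = -np.exp(rng.normal(size=(dim, state)))    # negative, so the state decays

y = selective_scan(x, w_delta, w_b, w_c, a)   # sequence-dimension processing
out = mix_rows(y)                             # separate MSA-dimension processing
```

Because the step size, input projection, and output projection all depend on the current token, the recurrence can choose what to retain or forget at each position, which is what distinguishes a selective scan from a fixed linear recurrence.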
