ICML Janus: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences

Poster in Workshop: ES-FoMo II: 2nd Workshop on Efficient Systems for Foundation Models

Krithik Ramesh · Sameed Siddiqui · Michael Mitzenmacher · Pardis Sabeti

Abstract:

Deep learning tools such as convolutional neural networks (CNNs) and transformers have spurred great advancements in computational biology. However, existing methods are constrained architecturally in context length, computational complexity, and model size. This paper introduces Janus, a subquadratic architecture for sequence modeling that combines projected gated convolutions and structured state spaces to capture local and global context at single-nucleotide resolution. Janus outperforms CNN-, GPT-, BERT-, and long-convolution-based models on many of the tested genomics tasks without pre-training and with 4x-781x fewer parameters. In the proteomics domain, Janus similarly outperforms pre-trained attention-based models, including ESM-1b and TAPE-BERT, on remote homology prediction, again without pre-training and with 3,308x-23,636x fewer parameters. Janus couples these performance improvements with reduced wall-clock times, showing up to a 50x speed-up over ESM-1b and a 7x speed-up over DistilProtBert for sequences of length up to 16,384.
