Poster in Workshop: ES-FoMo II: 2nd Workshop on Efficient Systems for Foundation Models
Janus: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences
Krithik Ramesh · Sameed Siddiqui · Michael Mitzenmacher · Pardis Sabeti
Deep learning tools such as convolutional neural networks (CNNs) and transformers have spurred great advancements in computational biology. However, existing methods are architecturally constrained in context length, computational complexity, and model size. This paper introduces Janus, a subquadratic architecture for sequence modeling that combines projected gated convolutions and structured state spaces to capture local and global context at single-nucleotide resolution. Janus outperforms CNN-, GPT-, BERT-, and long-convolution-based models on many tested genomics tasks without pre-training and with 4x-781x fewer parameters. In the proteomics domain, Janus similarly outperforms pre-trained attention-based models, including ESM-1b and TAPE-BERT, on remote homology prediction without pre-training and while using 3,308x-23,636x fewer parameters. Janus couples these performance improvements with reduced wall-clock times, showing up to a 50x speed-up over ESM-1b and a 7x speed-up over DistilProtBert for sequences of length up to 16,384.
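The abstract's core design, pairing a gated convolution (local context) with a structured state-space recurrence (global context at linear cost in sequence length), can be illustrated with a minimal sketch. This is an assumption-based toy in pure Python, not the authors' implementation: the kernel, gate, and state-space parameters (`kernel`, `gate_weight`, `a`, `b`, `c`) are hypothetical scalars chosen only to show the data flow.

```python
import math

def gated_conv(x, kernel, gate_weight):
    """Causal 1D convolution whose output is modulated by a sigmoid gate
    (local context). `kernel` and `gate_weight` are hypothetical parameters."""
    k = len(kernel)
    out = []
    for t in range(len(x)):
        # Causal convolution: only current and past positions contribute.
        conv = sum(kernel[j] * x[t - j] for j in range(k) if t - j >= 0)
        gate = 1.0 / (1.0 + math.exp(-gate_weight * x[t]))  # sigmoid gate
        out.append(conv * gate)
    return out

def ssm(x, a, b, c):
    """Diagonal linear state-space recurrence (global context in O(L)):
    h_t = a * h_{t-1} + b * x_t,  y_t = c * h_t."""
    h, ys = 0.0, []
    for xt in x:
        h = a * h + b * xt
        ys.append(c * h)
    return ys

# Toy sequence through the two stages: local gated conv, then global SSM.
x = [1.0, 0.0, -1.0, 2.0]
y = ssm(gated_conv(x, kernel=[0.5, 0.25], gate_weight=1.0), a=0.9, b=1.0, c=1.0)
```

Because the convolution has a fixed small kernel and the recurrence carries a single state forward, the whole block runs in time linear in sequence length, which is the property that lets such architectures scale to contexts of 16,384 and beyond.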