Poster

SparQ Attention: Bandwidth-Efficient LLM Inference

Luka Ribar ⋅ Ivan Chelombiev ⋅ Luke Hudlass-Galley ⋅ Charlie Blake ⋅ Carlo Luschi ⋅ Douglas Orr
2024 Poster

Abstract
