Poster

SALE: Low-bit Estimation for Efficient Sparse Attention in Long-context LLM Prefilling

Xiaodong Ji ⋅ Hailin Zhang ⋅ Fangcheng Fu ⋅ Bin Cui

Abstract
