Poster

RaBitQCache: Rotated Binary Quantization for KVCache in Long Context LLM Inference

Wenhao Li ⋅ Jinhao Dong ⋅ Hailin Zhang ⋅ Shi ⋅ WEI LU ⋅ Xiaoyong Du

Abstract
