Poster

LO-BCQ: Locally Optimal Block Clustered Quantization for 4-bit (W4A4) LLM Inference

Reena Elangovan ⋅ Charbel Sakr ⋅ Anand Raghunathan ⋅ Brucek Khailany

Abstract
