Generating Efficient Kernels for Quantized Inference on Large Language Models

Tommaso Pegolotti · Elias Frantar · Dan Alistarh · Markus Püschel

Abstract
