Generating Efficient Kernels for Quantized Inference on Large Language Models

Tommaso Pegolotti ⋅ Elias Frantar ⋅ Dan Alistarh ⋅ Markus Püschel

Abstract

