Attention Is All You Need But You Don’t Need All Of It For Inference of Large Language Models

Georgy Tyukin · Gbetondji Dovonon · Jean Kaddour · Pasquale Minervini

Abstract
