

Poster

Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference

Piotr Nawrot ⋅ Adrian Łańcucki ⋅ Marcin Chochowski ⋅ David Tarjan ⋅ Edoardo Ponti
2024 Poster

Abstract
