

Poster

FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU

Ying Sheng · Lianmin Zheng · Binhang Yuan · Zhuohan Li · Max Ryabinin · Beidi Chen · Percy Liang · Christopher Ré · Ion Stoica · Ce Zhang
2023 Poster


