Poster

SparseInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Activation

Qinsi Wang ⋅ Saeed Vahidian ⋅ Hancheng Ye ⋅ Jianyang Gu ⋅ Jianyi Zhang ⋅ Yiran Chen

Abstract
