Poster

AugServe: Adaptive Request Scheduling for Augmented Large Language Model Inference Serving

Ying Wang ⋅ Zhen Jin ⋅ Zhenqian Chen ⋅ Jiexiong Xu ⋅ Wenhai Lin ⋅ Yiquan Chen ⋅ Wenzhi CHEN

Abstract
