Poster

Beyond Prediction: Tail-Aware Scheduling for LLM Inference

Yueying Li ⋅ Yuanfan Chen ⋅ Jiayang Chen ⋅ Esha Choukse ⋅ Haoran Qiu ⋅ Edward Suh ⋅ Rodrigo Fonseca ⋅ Ziv Scully ⋅ Udit Gupta

Abstract
