Poster

Gumiho: A Hybrid Architecture to Prioritize Early Tokens in Speculative Decoding

Jinze Li · Yixing Xu · Haiduo Huang · Xuanwu Yin · Dong Li · Edith Ngai · Emad Barsoum
2025 Poster
