Fast Inference via Hierarchical Speculative Decoding

Clara Mohri ⋅ Amir Globerson ⋅ Haim Kaplan ⋅ Yishay Mansour ⋅ Tal Schuster

Abstract
