Poster in Workshop: Tokenization Workshop (TokShop)

Byte-level Tokenizers Unavoidably Enable LLMs to Generate Ill-formed UTF-8

Preston Firestone · Shubham Ugare · Gagandeep Singh · Sasa Misailovic
2025

Abstract
