Tokenization Workshop (TokShop)
Tomasz Limisiewicz · Valentin Hofmann · Sachin Kumar · Farhan Samir · Jindřich Libovický · Jindřich Helcl · Orevaoghene Ahia · Elizabeth Salesky
Abstract
Tokenization defines how data are represented as input and output for many current machine learning systems, including language models. Tokenization has been shown to significantly affect the utility and effectiveness of these models (Mielke et al., 2021). This finding has stirred considerable interest in tokenization as a research direction in machine learning and its subfields, such as natural language processing, but currently, there is no venue specifically dedicated to it. Our initiative—TokShop (Tokenization Workshop)—aims to fill this gap and will focus on tokenization in a broad sense.
Schedule
Timezone: America/Los_Angeles
9:00 AM · 10:50 AM · 12:00 PM · 1:50 PM · 3:00 PM · 4:30 PM · 5:00 PM