Data Mixture Inference Attack: BPE Tokenizers Reveal Training Data Compositions

Jonathan Hayase ⋅ Alisa Liu ⋅ Yejin Choi ⋅ Sewoong Oh ⋅ Noah Smith

Abstract
