Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer

Yuandong Tian ⋅ Yiping Wang ⋅ Beidi Chen ⋅ Simon Du

Abstract
