Poster

Single-Head Attention in High Dimensions: A Theory of Generalization, Weights Spectra, and Scaling Laws

Fabrizio Boncoraglio ⋅ Vittorio Erba ⋅ Emanuele Troiani ⋅ Yizhou Xu ⋅ Florent Krzakala ⋅ Lenka Zdeborova

Abstract
