Poster

The Structural Origin of Attention Sink: Variance Discrepancy, Super Neurons, and Dimension Disparity

Siquan Li ⋅ Kaiqi Jiang ⋅ Jiacheng Sun ⋅ Tianyang Hu

Abstract
