(4 events)
Oral
Tue Jul 23 07:30 AM -- 07:45 AM (PDT) @ Straus 1-3
Transformers Learn Nonlinear Features In Context: Nonconvex Mean-field Dynamics on the Attention Landscape
Juno Kim · Taiji Suzuki
Oral
Tue Jul 23 07:45 AM -- 08:00 AM (PDT) @ Straus 1-3
I/O Complexity of Attention, or How Optimal is FlashAttention?
Barna Saha · Christopher Ye
Oral
Tue Jul 23 08:00 AM -- 08:15 AM (PDT) @ Straus 1-3
Improving Transformers with Dynamically Composable Multi-Head Attention
Da Xiao · Qingye Meng · Shengping Li · Xingyuan Yuan
Oral
Tue Jul 23 08:15 AM -- 08:30 AM (PDT) @ Straus 1-3
Less is More: on the Over-Globalizing Problem in Graph Transformers
Yujie Xing · Xiao Wang · Yibo Li · Hai Huang · Chuan Shi