

Poster

AdaRoPE: Not All Attention Heads Should Rotate and Scale Equally

Shaowen Wang ⋅ Yuke Zheng ⋅ Tansheng Zhu ⋅ Shuang Chen ⋅ Shaofan Liu ⋅ Suncong Zheng ⋅ Li Jian

Abstract
