

Poster

Revisiting Efficiency–Accuracy Scaling in Mixture-of-Experts Architectures

Venmugil Elango ⋅ Nidhi Bhatia ⋅ Roger Waleffe ⋅ Rasoul Shafipour ⋅ Tomer Asida ⋅ Abhinav Khattar ⋅ Nave Assaf ⋅ Maximilian Golub ⋅ Joseph Guman ⋅ Tiyasa Mitra ⋅ Ritchie Zhao ⋅ Ritika Borkar ⋅ Ran Zilberstein ⋅ Mostofa Patwary ⋅ Mohammad Shoeybi ⋅ Bita Darvish Rouhani

Abstract
