Poster

Beyond Sunk Costs: Boosting LLM Pre-training Efficiency via Orthogonal Growth of Mixture-of-Experts

Ruizhe Wang ⋅ Yucheng Ding ⋅ Xiao Liu ⋅ Yaoxiang Wang ⋅ Peng CHENG ⋅ Baining Guo ⋅ Zheng-Jun Zha ⋅ Yeyun Gong

Abstract
