Poster

DTop-p MoE: Sparsity-Controlled Dynamic Top-p MoE for Foundation Model Pre-training

Can Jin ⋅ Hongwu Peng ⋅ Mingcan Xiang ⋅ Qixin Zhang ⋅ Xiangchi Yuan ⋅ Amit Hasan ⋅ Ohi Dibua ⋅ Yifan Gong ⋅ Yan Kang ⋅ Dimitris Metaxas

Abstract
