Poster

Stabilizing MoE Reinforcement Learning by Aligning Training and Inference Routers

Wenhan Ma ⋅ Hailin Zhang ⋅ Liang Zhao ⋅ Yifan Song ⋅ Yudong Wang ⋅ Fuli Luo ⋅ Zhifang Sui

Abstract
