Poster in Workshop: RLxF: RL from World Feedback

MOPD: Multi-Teacher On-Policy Distillation for Capability Integration in LLM Post-Training

Wenhan Ma ⋅ Jianyu Wei ⋅ Liang Zhao ⋅ Hailin Zhang ⋅ Bangjun Xiao ⋅ Lei Li ⋅ Qibin Yang ⋅ Bofei Gao ⋅ Yudong Wang ⋅ Rang Li ⋅ Jinhao Dong ⋅ Fuli Luo ⋅ Zhifang Sui

Abstract
