Poster

Towards On-Policy SFT: Distribution Discriminant Theory and its Applications in LLM Training

Miaosen Zhang ⋅ Yishan Liu ⋅ Shuxia Lin ⋅ Qi Dai ⋅ Chong Luo ⋅ Baining Guo ⋅ Weihao Jiang ⋅ Peng Hou ⋅ Anxiang Zeng ⋅ Xu Yang ⋅ Xin Geng

Abstract
