Poster

Clipping Low-Probability Tokens in SFT Yields a Generalizable Initialization for RL

Tian-Shuo Liu ⋅ Chengxing Jia ⋅ Haoyu Liu ⋅ Pengyuan Wang ⋅ Shiyuan Zhang ⋅ Jie Fu ⋅ Yang Yu

Abstract
