Poster

Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO

Yiming Ren ⋅ Yiran Xu ⋅ Zicheng Lin ⋅ Chufan Shi ⋅ Yukang Chen ⋅ Dingdong WANG ⋅ Tianhe Wu ⋅ Junjie Wang ⋅ Yujiu Yang ⋅ Yu Qiao ⋅ Ruihang Chu

Abstract
