Poster

Learning Useful Supervision for Reinforcement Learning in Reasoning Models

Liang CHEN ⋅ Xueting Han ⋅ Li Shen ⋅ Jing Bai ⋅ Kam-Fai Wong

Abstract