Poster

rePIRL: Learn PRM with Inverse RL for LLM Reasoning

Xian Wu ⋅ Kaijie Zhu ⋅ Ying Zhang ⋅ Lun Wang ⋅ Wenbo Guo

Abstract
