Poster

Reward and Guidance through Rubrics: Promoting Exploration to Improve Multi-Domain Reasoning

Baolong Bi ⋅ Shenghua Liu ⋅ Yiwei Wang ⋅ Siqian Tong ⋅ Lingrui Mei ⋅ Yuyao Ge ⋅ Yilong Xu ⋅ Jiafeng Guo ⋅ Xueqi Cheng

Abstract
