Poster

Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

Zhiheng Xi ⋅ Wenxiang Chen ⋅ Boyang Hong ⋅ Senjie Jin ⋅ Rui Zheng ⋅ Wei He ⋅ Yiwen Ding ⋅ Shichun Liu ⋅ Xin Guo ⋅ Junzhe Wang ⋅ Honglin Guo ⋅ Wei Shen ⋅ Xiaoran Fan ⋅ Yuhao Zhou ⋅ Shihan Dou ⋅ Xiao Wang ⋅ Xinbo Zhang ⋅ Peng Sun ⋅ Tao Gui ⋅ Qi Zhang ⋅ Xuanjing Huang
2024 Poster

Abstract
