

A Practical Two-Stage Recipe for Mathematical LLMs: Maximizing Accuracy with SFT and Efficiency with Reinforcement Learning

Hiroshi Yoshihara · Taiki Yamaguchi · Yuichi Inoue

Abstract
