Poster

Scaling up Multi-Turn Off-Policy RL and Multi-Agent Tree Search for LLM Step-Provers

Ran Xin ⋅ Zeyu Zheng ⋅ Yanchen Nie ⋅ Kun Yuan ⋅ Xia Xiao

Abstract
