Act Only When It Pays: Efficient Reinforcement Learning for LLM Reasoning via Selective Rollouts

Haizhong Zheng · Yang Zhou · Brian Bartoldson · Bhavya Kailkhura · Fan Lai · Jiawei Zhao · Beidi Chen

Abstract
