Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning

Shenao Zhang · Yaqing Wang · Canoee Liu · Tianqi Liu · Peter Grabowski · Eugene Ie · Zhaoran Wang · Yunxuan Li

Abstract
