

Beyond Markovian: Reflective Exploration via Bayes-Adaptive RL for LLM Reasoning

Shenao Zhang ⋅ Yaqing Wang ⋅ Canoee Liu ⋅ Tianqi Liu ⋅ Peter Grabowski ⋅ Eugene Ie ⋅ Zhaoran Wang ⋅ Yunxuan Li

Abstract
