Poster Thu, Jul 9, 2026 • 5:00 PM – 6:45 PM KST Coex: HALL A

Plan Then Action: High-Level Planning Guidance Reinforcement Learning for LLM Reasoning

Zhihao Dou ⋅ Qinjian Zhao ⋅ Zhongwei Wan ⋅ Zhang Dinggen ⋅ Weida Wang ⋅ Benteng Chen ⋅ Towsif Raiyan ⋅ Qingtao Pan ⋅ Yang Ouyang ⋅ Chaoda Song ⋅ Zhiqiang Gao ⋅ shufei zhang ⋅ Sumon Biswas

Abstract
