Poster

Attention Illuminates LLM Reasoning: The Uncovered Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization

Yang Li ⋅ Zhichen Dong ⋅ Yuhan Sun ⋅ Weixun Wang ⋅ Shaopan Xiong ⋅ Yijia Luo ⋅ Jiashun Liu ⋅ Han Lu ⋅ Jiamang Wang ⋅ Wenbo Su ⋅ Bo Zheng ⋅ Junchi Yan

Abstract
