Poster Tue, Jul 7, 2026 • 10:30 AM – 12:15 PM KST HALL A

Confidence and Difficulty-Adaptive Policy Optimization for LLM Reasoning

(Andrew) Zhanke Zhou ⋅ Xiangyu Lu ⋅ Chentao Cao ⋅ Brando Miranda ⋅ Tongliang Liu ⋅ Bo Han ⋅ Sanmi Koyejo

Abstract
