Poster Tue, Jul 7, 2026 • 10:30 AM – 12:15 PM KST Coex: HALL A

DisPPO: Quantile-Based Distributional Reinforcement Learning for Large Language Models

Zhijian Zhou ⋅ Long Li ⋅ Xuan Zhang ⋅ Zongkai Liu ⋅ Yanting Miao ⋅ Yuchen Liu ⋅ Deshu Chen ⋅ Ke Li ⋅ Xing Sun ⋅ Ruoxi Jiang ⋅ Xiaoyu Tan ⋅ Chao Qu ⋅ Yuan Qi

Abstract
