Poster

Stable Asynchrony: Variance-Controlled Off-Policy RL for LLMs

Luke Huang ⋅ Zhuoyang Zhang ⋅ Qinghao Hu ⋅ Shang Yang ⋅ Song Han

Abstract
