

Poster

OnePO: Direct One-stage Policy Optimization for SFT-free Domain Adaptation

Junying Chen ⋅ Xinyuan Xie ⋅ Ziniu Li ⋅ Benyou Wang

Abstract
