

Poster

$\textit{S}$-SPPO: Semantic-Calibrated Self-Play Preference Optimization

Xiwen Chen ⋅ Wenhui Zhu ⋅ Jingjing Wang ⋅ Peijie Qiu ⋅ Zhipeng Wang ⋅ Huayu Li ⋅ ZhengXiao He ⋅ XUANZHAO DONG ⋅ Prayag Tiwari ⋅ Mingkun Xu ⋅ Yujian Xiong ⋅ Feng Luo ⋅ Abolfazl Razi ⋅ Brendan Rappazzo ⋅ Anderson Schneider ⋅ Yuriy Nevmyvaka

Abstract
