Uncertainty-aware Preference Alignment in Reinforcement Learning from Human Feedback

Sheng Xu ⋅ Bo Yue ⋅ Hongyuan Zha ⋅ Guiliang Liu

Abstract
