Uncertainty-aware Preference Alignment in Reinforcement Learning from Human Feedback

Sheng Xu · Bo Yue · Hongyuan Zha · Guiliang Liu

Abstract
