Aligning Crowd Feedback via Distributional Preference Reward Modeling

Dexun Li ⋅ Cong Zhang ⋅ Kuicai Dong ⋅ Derrick Goh Xin Deik ⋅ Ruiming Tang ⋅ Yong Liu

Abstract
