

Regularized Best-of-N Sampling to Mitigate Reward Hacking for Language Model Alignment

Yuu Jinnai ⋅ Tetsuro Morimura ⋅ Kaito Ariu ⋅ Kenshi Abe

Abstract

