Regularized Best-of-N Sampling to Mitigate Reward Hacking for Language Model Alignment

Yuu Jinnai · Tetsuro Morimura · Kaito Ariu · Kenshi Abe

Abstract
