Poster

TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

Zhepei Wei ⋅ Xiao Yang ⋅ Kai Sun ⋅ Jiaqi Wang ⋅ Rulin Shao ⋅ Jingxiang Chen ⋅ Mohammad Kachuee ⋅ Teja Gollapudi ⋅ Yiwei Liao ⋅ Nicolas SCHEFFER ⋅ Rakesh Wanga ⋅ Anuj Kumar ⋅ Yu Meng ⋅ Scott Yih ⋅ Xin Dong

Abstract
