Poster

Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

Fahim Tajwar ⋅ Anikait Singh ⋅ Archit Sharma ⋅ Rafael Rafailov ⋅ Jeff Schneider ⋅ Tengyang Xie ⋅ Stefano Ermon ⋅ Chelsea Finn ⋅ Aviral Kumar
2024 Poster

Abstract
