Poster in Workshop: RLxF: RL from World Feedback

RL Excursions during Pre-training: How early is too early for on-policy learning?

Rachit Bansal ⋅ Clara Mohri ⋅ Tian Qin ⋅ David Alvarez-Melis ⋅ Sham Kakade

Abstract
