Poster in Workshop: RLxF: RL from World Feedback

A Few Teacher Steps Go a Long Way: Cost-Efficient On-Policy Data Augmentation for Agent Post-Training

Junze Ye ⋅ Jiayi Cheng ⋅ Lu Miao ⋅ Michal Mankowski ⋅ Jose Blanchet ⋅ Mohsen Bayati

Abstract
