Expo Talk Panel
Frontiers in Evaluation, Rewards, and Agent Environments
Lily Gack ⋅ Ying Liu ⋅ Kai Yang ⋅ Yunzhong He
HALL D2
Abstract:
As agents move toward real-world tasks with economic impact, evaluation and reward design are becoming increasingly complex. Scale AI will share insights from its research into recent trends at the frontier of evaluation, reward design, and agent environments. In particular, LLM evaluation is critical to model development: it defines the direction of improvement and unlocks RL scaling through automated feedback.