Scale AI

Expo Talk Panel

Frontiers in Evaluation, Rewards, and Agent Environments

Lily Gack ⋅ Ying Liu ⋅ Kai Yang ⋅ Yunzhong He

HALL D2
Sun 5 Jul 4 p.m. PDT — 5 p.m. PDT

Abstract:

As agents move toward real-world tasks with economic impact, evaluation and reward design are becoming increasingly complex. Scale AI will share insights from its research into recent trends at the frontier of evaluation, reward design, and agent environments. In particular, LLM evaluation is critical to model development: it defines the direction of improvement and unlocks RL scaling through automated feedback.
