Beyond Pass/Fail: Extracting Behavioral Insights from Large-Scale AI Agent Safety Evaluations
Cozmin Ududec
2025 Invited Speaker
in
Workshop: Workshop on Technical AI Governance
Abstract
Automated evaluations of LLM-based agents have become standard for assessing AI capabilities in both industry and government, but current reporting practices focus on what agents accomplish and give little resolution on how they accomplish it. In this talk I will discuss how UK AISI mines evaluation transcripts to (i) detect issues in evaluation tasks that could lead to mis-estimating capabilities, and (ii) understand how agent capabilities are evolving. I will survey a selection of AISI's methods, tools, and results, and outline research opportunities for building better analysis instruments and connecting them to safety and governance.