Spotlight Poster Tue, Jul 15, 2025 • 4:30 PM – 7:00 PM PDT

Position: In-House Evaluation Is Not Enough. Towards Robust Third-Party Evaluation and Flaw Disclosure for General-Purpose AI

Shayne Longpre · Kevin Klyman · Ruth Elisabeth Appel · Sayash Kapoor · Rishi Bommasani · Michelle Sahar · Sean McGregor · Avijit Ghosh · Borhane Blili-Hamelin · Nathan Butters · Alondra Nelson · Amit Elazari · Andrew Sellars · Casey Ellis · Dane Sherrets · Dawn Song · Harley Geiger · Ilona Cohen · Lauren McIlvenny · Madhulika Srikumar · Mark Jaycox · Markus Anderljung · Nadine Johnson · Nicholas Carlini · Nicolas Miailhe · Nik Marda · Peter Henderson · Rebecca Portnoff · Rebecca Weiss · Victoria Westerhoff · Yacine Jernite · Rumman Chowdhury · Percy Liang · Arvind Narayanan

