

Oral
Wed Jul 16 03:30 PM -- 03:45 PM (PDT) @ West Ballroom A
Position: AI Competitions Provide the Gold Standard for Empirical Rigor in GenAI Evaluation
D. Sculley · William Cukierski · Phil Culliton · Sohier Dane · Maggie Demkin · Ryan Holbrook · Addison Howard · Paul Mooney · Walter Reade · Meg Risdal · Nate Keating
Oral
Wed Jul 16 03:45 PM -- 04:00 PM (PDT) @ West Ballroom A
Position: Medical Large Language Model Benchmarks Should Prioritize Construct Validity
Ahmed Alaa · Thomas Hartvigsen · Niloufar Golchini · Shiladitya Dutta · Frances Dean · Inioluwa Raji · Travis Zack
Oral
Wed Jul 16 04:00 PM -- 04:15 PM (PDT) @ West Ballroom A
Position: Principles of Animal Cognition to Improve LLM Evaluations
Sunayana Rane · Cyrus Kirkman · Graham Todd · Amanda Royka · Ryan Law · Erica Cartmill · Jacob Foster
Oral
Wed Jul 16 04:15 PM -- 04:30 PM (PDT) @ West Ballroom A
Position: Political Neutrality in AI Is Impossible — But Here Is How to Approximate It
Jillian Fisher · Ruth Elisabeth Appel · Chan Young Park · Yujin Potter · Liwei Jiang · Taylor Sorensen · Shangbin Feng · Yulia Tsvetkov · Margaret Roberts · Jennifer Pan · Dawn Song · Yejin Choi