

(4 events)
Oral
Thu Jul 17 07:30 AM -- 07:45 AM (KST) @ West Ballroom A
Position: AI Competitions Provide the Gold Standard for Empirical Rigor in GenAI Evaluation
D. Sculley · William Cukierski · Phil Culliton · Sohier Dane · Maggie Demkin · Ryan Holbrook · Addison Howard · Paul Mooney · Walter Reade · Meg Risdal · Nate Keating
Oral
Thu Jul 17 07:45 AM -- 08:00 AM (KST) @ West Ballroom A
Position: Medical Large Language Model Benchmarks Should Prioritize Construct Validity
Ahmed Alaa · Thomas Hartvigsen · Niloufar Golchini · Shiladitya Dutta · Frances Dean · Inioluwa Raji · Travis Zack
Oral
Thu Jul 17 08:00 AM -- 08:15 AM (KST) @ West Ballroom A
Position: Principles of Animal Cognition to Improve LLM Evaluations
Sunayana Rane · Cyrus Kirkman · Graham Todd · Amanda Royka · Ryan Law · Erica Cartmill · Jacob Foster
Oral
Thu Jul 17 08:15 AM -- 08:30 AM (KST) @ West Ballroom A
Position: Political Neutrality in AI Is Impossible — But Here Is How to Approximate It
Jillian Fisher · Ruth Elisabeth Appel · Chan Young Park · Yujin Potter · Liwei Jiang · Taylor Sorensen · Shangbin Feng · Yulia Tsvetkov · Margaret Roberts · Jennifer Pan · Dawn Song · Yejin Choi