Uplifting Human Decision Making in AI Evaluation by Automating Benchmark Validity Analysis

Rodolfo Corona ⋅ Sang Truong ⋅ Ritwik Gupta ⋅ Nhi N Truong ⋅ Atnafu Lambebo Tonja ⋅ Mena Attia ⋅ Fahim Faisal ⋅ Kaushal K Maurya ⋅ Fred Philippy ⋅ Belu Ticona ⋅ Sumaya Nur Adan ⋅ Fazl Barez ⋅ Omar Florez ⋅ Supheakmungkol Sarin ⋅ Aseem Srivastava ⋅ Xiaoyuan Yi ⋅ Nick Haber ⋅ Dan Klein ⋅ Thamar Solorio ⋅ Xing Xie ⋅ Sanmi Koyejo ⋅ Robert Trager

Abstract
