Search All 2024 Events
 

95 Results

Poster
Wed 2:30 A Fine-grained Analysis of Fitted Q-evaluation: Beyond Parametric Models
Jiayi Wang · Zhengling Qi · Raymond K. W. Wong
Oral
Wed 7:30 Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
Gauthier Guinet · Behrooz Tehrani · Anoop Deoras · Laurent Callot
Poster
Thu 4:30 Stability Evaluation through Distributional Perturbation Analysis
Jose Blanchet · Peng Cui · Jiajin Li · Jiashuo Liu
Expo Talk Panel
Automated Evaluation of LLM responses
P Aditya Sreekar · Sahil Verma · Surnash Chopra
Expo Talk Panel
Sun 5:30 Automated Evaluation of LLM responses
Abhishek Persad · Akash Gupta
Oral
Wed 2:15 Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks
Linyuan Gong · Sida Wang · Mostafa Elhoushi · Alvin Cheung
Poster
Thu 4:30 Evaluating Instrument Validity using the Principle of Independent Mechanisms
Patrick F. Burauel
Poster
Wed 2:30 Beyond ELBOs: A Large-Scale Evaluation of Variational Methods for Sampling
Denis Blessing · Xiaogang Jia · Johannes Esslinger · Francisco Vargas · Gerhard Neumann
Poster
Wed 2:30 Evaluation of LLMs on Syntax-Aware Code Fill-in-the-Middle Tasks
Linyuan Gong · Sida Wang · Mostafa Elhoushi · Alvin Cheung
Poster
Wed 4:30 Automated Evaluation of Retrieval-Augmented Language Models with Task-Specific Exam Generation
Gauthier Guinet · Behrooz Tehrani · Anoop Deoras · Laurent Callot
Poster
Thu 2:30 InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks
Xueyu Hu · Ziyu Zhao · Shuang Wei · Ziwei Chai · Qianli Ma · Guoyin Wang · Xuwu Wang · Jing Su · Jingjing Xu · Ming Zhu · Yao Cheng · Jianbo Yuan · Jiwei Li · Kun Kuang · Yang Yang · Hongxia Yang · Fei Wu
Poster
Thu 2:30 Kernel-Based Evaluation of Conditional Biological Sequence Models
Pierre Glaser · Steffan Paul · Alissa M. Hummer · Charlotte Deane · Debora Marks · Alan Amin