

Poster

Beyond Benchmarks: Toward Causally Faithful Evaluation of Large Language Models

Zhengshuyuan Tian ⋅ Chuanxin Lan ⋅ Chenxi Wang ⋅ Lei Wang ⋅ Guoxin Kang ⋅ Zhengxin Yang ⋅ Yunyou Huang ⋅ Xuehai Hong ⋅ Gao ⋅ Jianfeng Zhan

Abstract
