Poster

Evaluating LLMs When They Do Not Know the Answer: Statistical Evaluation of Mathematical Reasoning via Comparative Signals

Zihan Dong ⋅ Zhixian Zhang ⋅ Yang Zhou ⋅ Can Jin ⋅ Ruijia Wu ⋅ Linjun Zhang

Abstract
