Poster Thu, Jul 9, 2026 • 5:00 PM – 6:45 PM KST Coex: HALL A

Evaluating LLMs When They Do Not Know the Answer: Statistical Evaluation of Mathematical Reasoning via Comparative Signals

Zihan Dong ⋅ Zhixian Zhang ⋅ Yang Zhou ⋅ Can Jin ⋅ Ruijia Wu ⋅ Linjun Zhang

Abstract
