Skip to yearly menu bar Skip to main content


Poster Wed, Jul 16, 2025 • 11:00 AM – 1:30 PM PDT

MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations

Kaixuan Huang · Jiacheng Guo · Zihao Li · Xiang Ji · Jiawei Ge · Wenzhe Li · Yingqing Guo · Tianle Cai · Hui Yuan · Runzhe Wang · Yue Wu · Ming Yin · Shange Tang · Yangsibo Huang · Chi Jin · Xinyun Chen · Chiyuan Zhang · Mengdi Wang

Abstract

Lay Summary

Video

Chat is not available.