Benchmarks Are Not Atomic: Composition-Aware LLM Evaluation using BenchHub

Eunsu Kim ⋅ Haneul Yoo ⋅ Guijin Son ⋅ Hitesh Patel ⋅ Amit Agarwal ⋅ Alice Oh

Abstract
