

You're reading LLM leaderboards wrong: Disentangling models from pipelines in engineering benchmarks

Marius Tacke ⋅ Shivam Suri ⋅ Matthias Busch ⋅ Mahish K Guru ⋅ Christian J Cyron ⋅ Roland Aydin

Abstract
