YC-Bench: Benchmarking AI Agents for Long-Term Planning and Consistent Execution

Muyu He ⋅ Vincent Tu ⋅ Adit Jain ⋅ Anand Kumar ⋅ Sachin Patro ⋅ Soumyadeep Bakshi ⋅ Nazneen Rajani

Abstract
