ICML Expo Talk Panel Otter: Generating Tests from Issues to Validate SWE Patches

Toufique Ahmed

Abstract:

Recent software engineering (SWE) agents generate code to resolve issues. While great for productivity, such systems make good tests even more important. Unfortunately, most prior work on test generation assumes that the code under test already exists. We instead address the case where the code patch that resolves the issue has not yet been written. We introduce Otter, an LLM-based solution for generating tests from issues. Otter augments LLMs with rule-based analysis to check and repair their outputs, and introduces a novel self-reflective action planning stage. As of March 9, 2025, Otter is the state of the art (SOTA) for this scenario, topping the SWT-Bench Verified leaderboard.
