Poster in Workshop: AI as a Tool for Mathematics, Computer Science, and Machine Learning
Thu, Jul 9, 2026 • 7:50 PM – 9:00 PM PDT

SIA-W: Self-Improving Agents with Test-Time Weight Updates

Prannay Hebbar ⋅ Samuel Verboomen ⋅ Selvam Palanimalai ⋅ Yogendra Manawat ⋅ Kunal Bhatia ⋅ Vignesh Baskaran


Abstract

We introduce SIA-W, a self-improving agent framework that jointly optimises the agent scaffold and the model weights. Scaffold iteration is a powerful first lever: by evolving tools, prompts, and execution harnesses across generations, the agent rapidly builds domain-adapted search and reasoning procedures. Once the scaffold has matured, SIA-W compounds these gains with a second lever, test-time RL, which adapts the model weights directly on task feedback. We evaluate on three diverse research tasks spanning law (charge classification), systems (GPU kernel optimisation), and biology (single-cell denoising). Combining both levers delivers substantial gains over scaffold iteration alone: 16 percentage points on LawBench, a 19% runtime reduction on GPU kernels, and a 19% improvement on denoising. The weight updates surface domain knowledge that complements what the harness builds.
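The two-lever structure described in the abstract can be illustrated with a toy sketch. This is not the SIA-W implementation: the task, the reward table `REWARDS`, the `MIN_TOOLS` constraint, and every function name below are hypothetical. The "scaffold" is reduced to a mask over available actions, the "weights" to softmax logits, and test-time RL is stood in for by an exact policy-gradient step on a bandit, chosen so the example is deterministic.

```python
import math
import random

# Hypothetical toy task: choose one of six actions; rewards are made up.
REWARDS = [0.1, 0.2, 0.9, 0.3, 0.6, 0.05]
# Assumption: the scaffold must expose at least this many actions,
# so scaffold iteration alone cannot fully solve the toy task.
MIN_TOOLS = 3


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(scaffold, logits):
    """Expected reward of a softmax policy over the actions the scaffold exposes."""
    allowed = [i for i, on in enumerate(scaffold) if on]
    probs = softmax([logits[i] for i in allowed])
    return sum(p * REWARDS[i] for p, i in zip(probs, allowed))


def evolve_scaffold(scaffold, logits, generations, rng):
    """Lever 1: mutate the scaffold and keep a mutation only if the score improves."""
    best = list(scaffold)
    best_score = expected_reward(best, logits)
    for _ in range(generations):
        candidate = list(best)
        flip = rng.randrange(len(candidate))
        candidate[flip] = not candidate[flip]
        if sum(candidate) < MIN_TOOLS:
            continue  # reject scaffolds that expose too few actions
        score = expected_reward(candidate, logits)
        if score > best_score:
            best, best_score = candidate, score
    return best


def adapt_weights(scaffold, logits, steps, lr=0.2):
    """Lever 2: exact policy-gradient ascent on the logits, scaffold frozen."""
    logits = list(logits)
    allowed = [i for i, on in enumerate(scaffold) if on]
    for _ in range(steps):
        probs = softmax([logits[i] for i in allowed])
        baseline = sum(p * REWARDS[i] for p, i in zip(probs, allowed))
        for p, i in zip(probs, allowed):
            # d E[r] / d logit_i = p_i * (r_i - E[r]) for a softmax policy
            logits[i] += lr * p * (REWARDS[i] - baseline)
    return logits


def run(seed=0):
    rng = random.Random(seed)
    scaffold = [True] * len(REWARDS)
    logits = [0.0] * len(REWARDS)
    start = expected_reward(scaffold, logits)
    scaffold = evolve_scaffold(scaffold, logits, generations=20, rng=rng)
    after_scaffold = expected_reward(scaffold, logits)
    logits = adapt_weights(scaffold, logits, steps=200)
    after_weights = expected_reward(scaffold, logits)
    return start, after_scaffold, after_weights
```

Calling `run()` returns the score before any optimisation, after scaffold iteration, and after the weight update. In this toy setting the scaffold phase can only prune poor actions, while the weight phase shifts probability toward the best remaining one, which mirrors the ordering in the abstract (mature the scaffold first, then adapt the weights).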