The Interplay of Harness Design and Post-Training in LLM Agents
Kyungmin Kim ⋅ Youngbin Choi ⋅ Seoyeon Lee ⋅ Suhyeon Jun ⋅ Dongwoo Kim ⋅ Sangdon Park
Abstract
Tool-integrated LLM agents operate within a *harness*: the scaffolding that determines which tools are exposed, how they are described, and what auxiliary information accompanies each per-step observation. Prior work typically treats this scaffolding either as a fixed engineering detail or as a knob tuned only in the training-free regime. How harness design interacts with post-training is left largely unexamined, even though deployed agents are routinely post-trained and routinely face shifts in their tasks and tool environments. We study this interaction directly. Building a benchmark on top of $\texttt{ALFWorld}$, we expose three independently controllable design dimensions (tool schema, task type, and harness) and evaluate agents across four regimes: zero-shot, in-distribution post-training, task shift, and tool-environment shift, each run under three harness levels. Our experiments show that harness and post-training are not separable design choices. Harness richness alone drives large zero-shot gaps, the harness used during post-training reshapes both in-distribution success and out-of-distribution robustness, and a harness applied only after training recovers little of the benefit of training with it in place. For the smaller model, a harness is in fact a prerequisite for post-training to take effect at all. These results indicate that the harness should be treated as a first-class design dimension, chosen jointly with post-training rather than after it.