TeamTR: Trust-Region Fine-Tuning for Multi-Agent LLM Coordination
Abstract
Multi-agent LLM systems can improve reasoning and tool use, yet recent evidence shows their gains are often unstable and sensitive to interaction design. A promising direction is to \emph{train} collaboration, but team post-training introduces a moving-target effect: when agents interact through a shared context, updating one agent shifts the context distribution faced by the others, which can cause coordination to regress under naive sequential updates. We propose \textit{TeamTR}, a trust-region framework for fine-tuning heterogeneous LLM teams that explicitly controls this \emph{occupancy shift}. TeamTR evaluates each agent update on rollouts from the \emph{intermediate} team induced by partially applied updates, and enforces per-agent trust regions via a token-decomposed \emph{reverse} KL that is directly monitorable from those rollouts. This yields population-level per-update and per-stage \emph{improvement lower bounds} whose functional form applies to any realized update order, and motivates a practical certificate \emph{proxy} computed from logged surrogates and KL terms. We instantiate TeamTR for router-based text handoff with sequence-level returns and bounded group-normalized advantages, and show empirically that it mitigates coordination regressions, improves training stability across heterogeneous teams, and supports modular component replacement via a trust-region alignment step.
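To make the monitored quantity concrete, here is a minimal sketch under assumed notation (the symbols $\pi_i$, $\pi_i^{\mathrm{old}}$, $\mathcal{T}_i$, and $\delta_i$ are illustrative, not necessarily the paper's): because rollouts are drawn from the intermediate team, i.e., with agent $i$ already acting under its updated policy $\pi_i$, the reverse KL admits a direct Monte Carlo estimate over the tokens that agent $i$ emits,
\[
\widehat{D}_{\mathrm{KL}}\!\big(\pi_i \,\|\, \pi_i^{\mathrm{old}}\big)
= \frac{1}{|\mathcal{T}_i|} \sum_{t \in \mathcal{T}_i}
\Big( \log \pi_i(a_t \mid s_t) - \log \pi_i^{\mathrm{old}}(a_t \mid s_t) \Big),
\]
where $\mathcal{T}_i$ indexes agent $i$'s tokens in the logged rollouts; a per-agent trust region of this kind then amounts to keeping the estimate below a radius $\delta_i$.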