Persistent Semantic Entities in Tool-Augmented LLM Systems
Abstract
Tool-augmented LLM agents can harbor implicit state that persists across sessions, activates in response to events, and propagates across agent boundaries, all while remaining invisible to standard debugging. We formalize this as Persistent Semantic Entities (PSEs): constructs defined by name binding, event triggering, and cross-boundary propagation. Experiments across twenty models from nine families (OpenAI, Anthropic, Google, Meta, Alibaba, DeepSeek, Mistral, Zhipu, Moonshot), spanning 1.5B to 1T parameters, reveal three findings. First, every tested architecture is susceptible to PSEs, with rates ranging from 20% to 100%, including Claude (88%) and Gemini (84–96%); even the largest model (1T parameters) shows 50% susceptibility. Second, contamination does not decay; it increases over conversation turns, because instruction-tuned models reinforce rather than forget injected state. Third, self-reflection provides inconsistent protection, ranging from no effect to a negative effect (contamination increases by 14% on Claude-Sonnet-4), whereas quarantine-based validation consistently achieves a 57–85% reduction across models. We validate these findings against documented production incidents. Our work establishes PSEs as a distinct phenomenon requiring architectural solutions beyond prompt engineering.