VIRUS: Injecting Persistent Cognitive Pathogens into Stateful Zero-Shot Object Navigation Agents
Abstract
Zero-Shot Object Navigation (ZSON) agents rely on continuously updated internal states to support long-horizon planning and decision-making. However, existing methods depend heavily on the observational outputs of vision-language models (VLMs) during state updates and do not explicitly validate whether those observations are authentic. This structural vulnerability allows injected adversarial information to be written into long-term memory, where it persistently disrupts subsequent planning. Exploiting this vulnerability, we propose the Visual-Instruction Recurrent Update Subversion (VIRUS) framework, the first training-free backdoor attack scheme that specifically targets the state update stage of ZSON agents. Upon dual-trigger activation, VIRUS generates velocity-modulated, geometrically consistent adversarial potential fields on the navigable manifold to entrap agents. Crucially, it employs an irreversible state update operator that locks this corruption into memory, preventing the agent's intrinsic self-healing mechanisms from correcting it. Extensive experiments demonstrate that VIRUS achieves high attack success rates across diverse ZSON agents and advanced VLMs. The framework generalizes robustly to visual and textual variations and bypasses the defense mechanisms of safety-aligned large models.