NaviAgent: Graph‑Driven Bilevel Planning for Scalable Tool Orchestration
Yan Jiang ⋅ HAO ZHOU ⋅ Lizhong Gu ⋅ Tianlong Li ⋅ Ruinan Jin ⋅ Wanqi Zhou ⋅ Ai Han
Abstract
Large Language Models (LLMs) increasingly act as function‑calling agents that invoke external tools to tackle tasks beyond their static knowledge. However, they typically invoke tools one at a time without a global view of task structure. Because tools often depend on one another, this leads to error accumulation and poor scalability, particularly with hundreds or thousands of tools. To address these limitations, we propose NaviAgent, an explicit bilevel architecture that decouples task planning from tool execution through graph‑based modeling of tool relations. At the planning level, the LLM‑based agent decides whether to respond directly, clarify intent, or retrieve and execute a toolchain, independent of inter‑tool complexity. At the execution level, a Tool World Navigation Model (TWNM) encodes structural and behavioral relations among tools, steering the agent to compose scalable and robust invocation sequences. By incorporating feedback from real tool interactions, NaviAgent achieves closed‑loop alignment between planning and execution, enabling adaptive navigation in large‑scale tool ecosystems. Evaluations on API-Bank and ToolBench show consistent improvements in task success rate (TSR), with TWNM boosting performance on complex tasks by up to 17 points. Further tests on 50 real APIs across 7 domains confirm an average 10\% improvement in TSR over $\alpha$‑UMI with fewer steps and lower latency, demonstrating robust generalization under real‑world dynamics.
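To make the idea of graph‑based tool relations concrete, the following is a minimal illustrative sketch, not the TWNM itself. It assumes tool relations can be reduced to a plain dependency graph and orders a toolchain with a standard topological sort. The tool names, the `TOOL_DEPS` mapping, and the `resolve_toolchain` helper are all hypothetical and do not come from the paper.

```python
from graphlib import TopologicalSorter

# Hypothetical tool dependency graph: each tool maps to the tools whose
# outputs it needs as inputs. Names are illustrative, not from the paper.
TOOL_DEPS = {
    "get_weather": {"geocode_city"},
    "geocode_city": set(),
    "book_flight": {"search_flights"},
    "search_flights": {"geocode_city"},
}


def resolve_toolchain(target, deps):
    """Return the tools needed to call `target`, dependencies first."""
    # Collect the target and everything it transitively depends on.
    needed, stack = set(), [target]
    while stack:
        tool = stack.pop()
        if tool not in needed:
            needed.add(tool)
            stack.extend(deps.get(tool, ()))
    # Order the subgraph so that every tool follows its prerequisites.
    subgraph = {tool: deps.get(tool, set()) & needed for tool in needed}
    return list(TopologicalSorter(subgraph).static_order())


chain = resolve_toolchain("book_flight", TOOL_DEPS)
print(chain)  # ['geocode_city', 'search_flights', 'book_flight']
```

In this simplified view, a planner would only need to name the target tool, and the graph would supply the invocation order. A learned model such as the TWNM presumably captures richer structural and behavioral relations than the static dependencies assumed here.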