CoPE: A Framework for Optimizing Coordination between Planning and Execution in LLM-based Agents
Huanxi Liu ⋅ Kun Hu ⋅ Qiang Wang ⋅ Yuanzhao Zhai ⋅ Feng Dawei ⋅ Bo Ding ⋅ Huaimin Wang
Abstract
Fine-tuning Large Language Models (LLMs) as autonomous agents on domain-specific data has emerged as a promising paradigm for tackling interactive, real-world tasks. However, existing studies have overlooked the critical coordination between long-term planning and multi-step execution when optimizing agent capabilities. This oversight allows impractical plans and plan-deviated trajectories to propagate into the optimization process, degrading task performance and hindering the further development of LLM-based agents on long-horizon tasks. To bridge this gap, we propose $\textbf{CoPE}$, a novel framework that explicitly integrates planning–execution coordination into LLM-based agent optimization. CoPE employs Self-Refining MCTS to generate task plans and multiple execution trajectories through environment interactions. By quantifying the coordination between planning and execution, CoPE assigns higher optimization weights to well-coordinated samples, enabling LLM-based agents to learn better planning and execution policies. Extensive experiments demonstrate that CoPE substantially improves agent coordination, outperforming state-of-the-art baselines on benchmarks comprising two long-horizon multi-step tasks. Code and data are available at https://anonymous.4open.science/r/CoPE-F144.
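The abstract's coordination-weighting idea can be sketched as follows. This is a minimal illustrative assumption, not CoPE's actual formulation: we score each (plan, trajectory) pair by how many planned steps the execution realizes, then turn scores into softmax weights so well-coordinated samples receive larger optimization weights. All function names and the scoring rule here are hypothetical.

```python
from math import exp

def coordination_score(plan_steps, executed_steps):
    """Fraction of planned steps actually realized during execution.

    A stand-in for CoPE's coordination measure (assumed, not from the paper).
    """
    if not plan_steps:
        return 0.0
    matched = sum(1 for step in plan_steps if step in executed_steps)
    return matched / len(plan_steps)

def optimization_weights(samples):
    """Softmax over coordination scores: well-coordinated plan/trajectory
    pairs get larger weights in the fine-tuning objective."""
    scores = [coordination_score(plan, traj) for plan, traj in samples]
    exps = [exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

samples = [
    # plan fully realized by execution -> well coordinated
    (["open", "search", "buy"], ["open", "search", "buy"]),
    # execution deviates from the plan -> poorly coordinated
    (["open", "search", "buy"], ["open", "browse"]),
]
weights = optimization_weights(samples)
```

Under this toy scoring, the coordinated pair receives a strictly larger weight than the plan-deviated one, which is the qualitative behavior the abstract describes.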