PDAgent: An LLM-Driven Autonomous Agent Framework Towards *In Silico* Protein Design via Directed Mutation
Abstract
Computational protein design holds promise across diverse domains, but existing approaches face significant challenges. Traditional physics-based methods require substantial domain expertise, while emerging deep learning methods often rely on restricted functional ontologies, struggle to bridge the semantic gap between text and protein sequences, or lack closed-loop optimization mechanisms. In this paper, we present PDAgent, an LLM-driven autonomous agent framework that enables *in silico* protein design through template-based directed mutation. The framework accepts natural language specifications of desired protein properties and runs a ReAct-style reasoning loop with five phases: THINK, PLAN, ACT, OBSERVE, and REFLECT. PDAgent integrates template retrieval, conservation-aware mutation strategies, and domain-specific computational tools for property optimization across eight biophysical dimensions. Experiments on 100 diverse protein design tasks show that PDAgent achieves an average constraint satisfaction rate of 91.34% with high structural quality (mean pLDDT 87.69), substantially outperforming both direct LLM generation and specialized deep learning methods.
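To make the five-phase loop and the constraint satisfaction metric concrete, the following is a minimal sketch and not the PDAgent implementation. It assumes a greedy single-point mutation search in place of LLM reasoning, two toy properties (`net_charge`, `hydrophobic_fraction`) in place of the eight biophysical dimensions, and a fixed set of conserved positions in place of a real conservation analysis. All names (`evaluate`, `violation`, `satisfaction_rate`, `design`, `ALPHABET`) are hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical constraint format: property name -> (min, max) acceptable range.
Constraints = Dict[str, Tuple[float, float]]

HYDROPHOBIC = set("AVILMFWC")
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}
ALPHABET = "KELS"  # small illustrative mutation alphabet (assumption)


def evaluate(seq: str) -> Dict[str, float]:
    """Toy stand-in for domain-specific property tools."""
    return {
        "net_charge": float(sum(CHARGE.get(a, 0) for a in seq)),
        "hydrophobic_fraction": sum(a in HYDROPHOBIC for a in seq) / len(seq),
    }


def violation(props: Dict[str, float], constraints: Constraints) -> float:
    """Total distance by which properties fall outside their ranges."""
    return sum(max(lo - props[n], 0.0, props[n] - hi)
               for n, (lo, hi) in constraints.items())


def satisfaction_rate(props: Dict[str, float], constraints: Constraints) -> float:
    """Fraction of constraints whose property value lies inside its range."""
    met = sum(lo <= props[n] <= hi for n, (lo, hi) in constraints.items())
    return met / len(constraints)


def design(template: str, constraints: Constraints,
           conserved: Set[int], max_steps: int = 20) -> str:
    seq = template
    for _ in range(max_steps):
        # THINK: measure how far the current sequence is from the goal.
        current = violation(evaluate(seq), constraints)
        if current == 0.0:
            break
        # PLAN: enumerate single-point mutations at non-conserved positions.
        candidates = [(i, aa) for i in range(len(seq)) if i not in conserved
                      for aa in ALPHABET if aa != seq[i]]
        if not candidates:
            break
        # ACT + OBSERVE: apply each mutation and score the resulting mutant.
        scored = []
        for i, aa in candidates:
            mutant = seq[:i] + aa + seq[i + 1:]
            scored.append((violation(evaluate(mutant), constraints), mutant))
        best_score, best = min(scored)
        # REFLECT: keep the mutation only if it reduces the violation.
        if best_score >= current:
            break
        seq = best
    return seq


constraints = {"net_charge": (2.0, 3.0), "hydrophobic_fraction": (0.2, 0.4)}
designed = design("GGGGGGGGGG", constraints, conserved={0, 9})
print(designed, satisfaction_rate(evaluate(designed), constraints))
```

In this sketch the conserved positions are never mutated, and the loop stops as soon as all constraints are met or no single mutation helps. An average constraint satisfaction rate would then be the mean of `satisfaction_rate` over a set of design tasks.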