Deliberate Evolution for Sample-Efficient Symbolic Regression with LLM
Abstract
Symbolic regression (SR) stands as a cornerstone of scientific discovery, deriving mathematical expressions from observed data. Recent advances incorporate large language models (LLMs) into evolutionary optimization, typically relying on iterative refinement driven by scalar feedback (e.g., mean squared error, MSE). However, such coarse feedback lacks both directional guidance for strategic lookahead and diagnostic signals to localize structural errors, confining the search to a myopic trial-and-error process. Additionally, treating optimization steps as isolated episodes precludes learning from historical trajectories. Consequently, optimization often degenerates into an inefficient search with substantial computational cost. Motivated by these limitations, we propose Deliberate Evolution, an agentic framework for SR that equips LLM-based candidate proposal with explicit, structured guidance. Our approach steers optimization through adaptive evolutionary operators for directional control, analytical tools for diagnostic feedback, and reflective memory for historical insight. Extensive experiments on LLM-SRBench demonstrate that our approach consistently outperforms prior baselines while using merely 40\% of the sample budget.