Neural--Evolutionary Symbolic Regression with Global Constraints: Constraint-Aware Decoding and Reward Shaping
Abstract
Symbolic regression discovers interpretable mathematical expressions from data and is central to scientific modeling. Recent neural approaches typically linearize expression trees into token sequences for sequential generation, but this representation obscures the underlying hierarchy and makes structure-dependent constraints difficult to enforce. Hybrid neural--evolutionary frameworks further combine neural generators with genetic programming (GP), yet training can be unstable due to the distribution mismatch between neural samples and GP-refined elites. We propose \textbf{GCN-SR}, a graph-based symbolic regression framework that generates expressions directly in explicit tree form. GCN-SR introduces \textbf{Symbolic Perfect Binary Trees (SPBTs)}, a fixed-topology scaffold that preserves hierarchical structure while enabling batched tree generation with an autoregressive generator based on a Graph Convolutional Network (GCN). To exploit GP refinement without using GP-refined elites as direct supervision targets, we further introduce \textbf{Similarity-Weighted Policy Gradient (SWPG)}, which uses GP only to construct similarity-weighted reward signals. Experiments on standard symbolic regression benchmarks, together with extensive ablations, show that GCN-SR consistently outperforms strong neural and hybrid baselines.
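The abstract does not spell out the SPBT encoding, but the idea of a fixed-topology scaffold can be illustrated with a minimal sketch. Here we assume (as one plausible realization, not the paper's actual implementation) heap-style array indexing, where the children of node $i$ sit at $2i+1$ and $2i+2$, and a padding token that fills unused slots so every expression shares the same topology and can be batched:

```python
import math

# Hypothetical token vocabulary; the paper's actual vocabulary is not specified.
BINARY = {"+", "-", "*", "/"}
UNARY = {"sin", "cos", "log"}
PAD = "<pad>"  # fills unused slots so every tree shares the same fixed topology


def evaluate(tokens, i, x):
    """Recursively evaluate an SPBT stored in heap order:
    the children of node i sit at positions 2*i+1 and 2*i+2."""
    tok = tokens[i]
    if tok in BINARY:
        left = evaluate(tokens, 2 * i + 1, x)
        right = evaluate(tokens, 2 * i + 2, x)
        return {"+": left + right, "-": left - right,
                "*": left * right, "/": left / right}[tok]
    if tok in UNARY:  # unary ops read only the left child; the right slot is padding
        return getattr(math, tok)(evaluate(tokens, 2 * i + 1, x))
    return x if tok == "x" else float(tok)


# sin(x) + x*x encoded on a depth-3 perfect binary tree (7 slots):
tokens = ["+", "sin", "*", "x", PAD, "x", "x"]
print(evaluate(tokens, 0, 2.0))  # sin(2) + 4 ≈ 4.909
```

Because every sampled expression occupies the same 7-slot array, a generator can emit a batch of trees as one fixed-shape tensor, which is the batching property the abstract attributes to SPBTs.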