What Makes a Desired Graph for Relational Deep Learning?
Abstract
Relational deep learning (RDL) converts relational databases (RDBs) into heterogeneous graphs, but graphs derived directly from database schemas are often poorly suited to how graph neural networks (GNNs) perform relational reasoning. We study what makes a relational graph suitable for deep learning and show that schema-derived graphs suffer from two systematic failures: information overload and semantic fragmentation. Through an empirical analysis of real-world databases, we find that effective graphs arise from a task-dependent balance between removing task-irrelevant structure and injecting task-aligned relational connectivity. Filtering exhibits a non-monotonic effect on performance, while structural injection is beneficial only when it reflects the logic of the downstream task. Based on these findings, we develop an end-to-end structural optimizer that applies both operations to adapt relational graphs automatically. Across 23 tasks spanning classification, regression, and recommendation, the optimized graphs consistently improve accuracy while often reducing inference cost. Code and data are available at https://anonymous.4open.science/r/StructuralOptimizerRDL-0F74/.