RADE: Unbiased Random Add-Drop Edge as a Regularizer
Abstract
Graph Neural Networks (GNNs) are prone to overfitting and to over-squashing of long-range information. Stochastic graph perturbations (e.g., edge/node dropping) regularize training, but often (i) induce a train-test mismatch in expected message aggregation, (ii) lack a principled mechanism for random edge addition, (iii) offer limited control over regularization strength, and (iv) require dataset-specific tuning of perturbation hyperparameters. We propose Unbiased Random Add-Drop Edge (RADE), which independently drops edges and adds uniformly sampled non-edges while preserving the expected aggregation at each layer. RADE models graph perturbations as mean-zero logit noise, which naturally induces a variance-weighted regularization penalty. We derive a drop/add variance decomposition that clarifies the complementary effects of edge deletion and addition, and propose an epoch-wise GradNorm rule that adaptively tunes the deletion and addition rates during training. We further propose RADE with Inference Correction (RADE-IC), which adds inference-time shortcut edges to introduce additional message pathways and mitigate over-squashing. Experiments on node- and graph-classification benchmarks show consistent gains with RADE, while RADE-IC yields notable improvements on over-squashing-prone tasks. Ablations validate the roles of unbiased aggregation, GradNorm adaptivity, and drop/add complementarity.
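As a minimal illustration of the unbiasedness constraint described above (a sketch with illustrative symbols $p$, $q$, $\gamma$, not necessarily the paper's exact parameterization), suppose each existing edge is kept independently with probability $1-p$ and its weight rescaled by $1/(1-p)$, while each non-edge receives a signed weight $\pm\gamma$ with probability $q$ and a uniformly random sign:
\[
\tilde{A}_{ij} \;=\; \frac{m_{ij}}{1-p}\,A_{ij} \;+\; s_{ij}\, b_{ij}\,\gamma\,(1-A_{ij}),
\qquad
m_{ij}\sim\mathrm{Bern}(1-p),\;\;
b_{ij}\sim\mathrm{Bern}(q),\;\;
s_{ij}\sim\mathrm{Unif}\{-1,+1\}.
\]
Then $\mathbb{E}[\tilde{A}_{ij}] = A_{ij}$ for every pair $(i,j)$, and hence $\mathbb{E}[\tilde{A}H] = AH$ for any node features $H$, so the expected aggregation at each layer is preserved while the mean-zero perturbation contributes only variance, consistent with the variance-weighted regularization view.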