Geometry-Aware Tabular Diffusion
Abstract
Tabular data synthesis is critical for privacy-preserving data sharing and data augmentation, yet existing diffusion models rely on implicit attention mechanisms to capture inter-column relationships. We introduce Geometry-Aware Tabular Diffusion, which augments diffusion models with explicit pairwise geometric features (angles and lengths) computed directly from differences between column values. Our method achieves state-of-the-art performance on standard benchmarks while using 3.5 times fewer parameters on average (up to 25 times fewer for classification tasks) than transformer-based approaches. Across ten datasets, it achieves the best result on 8 of 10 for Shape (marginal fidelity) with a 27% error reduction, on 7 of 10 for Trend (correlation preservation) with a 20% error reduction, and on 9 of 10 for downstream utility (F1/RMSE). These results demonstrate that explicit relational structure can substitute for model capacity, enabling state-of-the-art tabular synthesis with simple, efficient architectures.
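To make the idea of pairwise geometric features concrete, the following is a minimal sketch of one possible reading. The abstract does not give the exact formulas, so everything here is an assumption for illustration: the function name `pairwise_geometry` is hypothetical, and the choice to treat each column pair as a segment with a unit horizontal offset (so that the angle is the slope angle of the value difference and the length is the segment length) is ours, not necessarily the construction used in the method.

```python
import math
from itertools import combinations


def pairwise_geometry(row):
    """Return {(i, j): (angle, length)} for every column pair i < j.

    Illustrative assumption: columns i and j are placed one unit apart
    horizontally, and the segment joining their values is described by
    its slope angle and its Euclidean length.
    """
    feats = {}
    for i, j in combinations(range(len(row)), 2):
        delta = row[j] - row[i]          # column value difference
        angle = math.atan2(delta, 1.0)   # slope angle in radians (assumed definition)
        length = math.hypot(1.0, delta)  # segment length (assumed definition)
        feats[(i, j)] = (angle, length)
    return feats


# A single row with three (already normalized) numeric columns.
row = [0.0, 1.0, 1.0]
features = pairwise_geometry(row)
for pair, (angle, length) in features.items():
    print(pair, round(angle, 4), round(length, 4))
```

Under this reading, a row with d columns yields d(d-1)/2 angle and length pairs, which could then be passed to the denoising network as additional inputs alongside the raw column values.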