On the Rotation-Equivariance Geometry of Tabular Foundation Models
Abstract
Tree-based models often outperform deep tabular models on benchmarks where features carry domain meaning, a phenomenon attributed to axis-alignment in the feature representation. We characterise when tabular foundation models (TFMs) preserve or break feature-rotation symmetry under the orthogonal group O(d), complementing prior work on target-permutation equivariance. We prove that prior-data fitted network (PFN) architectures with a row-affine encoder and a d-blind trunk and head are class-level closed under O(d), so the orbit-averaged predictor is O(d)-invariant. Under isotropic priors, the orbit-averaged predictor additionally dominates the unaveraged predictor in expected risk. By contrast, an architecture with a population-level column tokeniser that satisfies a basic witness condition (met by standard nonlinear activations) is generically rotation-variant in the analytic-genericity sense: its equivariant parameter locus is Lebesgue-null and nowhere dense. Across 9 strict-Grinsztajn binary tasks and 6 architectures, the predicted class-level separation holds on every task, with median rotation standard deviation 0.0012–0.0025 in the class-invariant band and 0.031–0.044 in the strongly-variant band. A Monte Carlo oracle-Bayes-floor estimator quantifies the irreducible binary cross-entropy (BCE) of the synthetic priors used here (≈0.39 BCE). Trained models plateau ≈0.22 BCE above this oracle floor, and a single-recipe 3× capacity probe does not close the gap.
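To make two quantities from the abstract concrete, rotation standard deviation and the orbit-averaged predictor, the following is a minimal Python sketch. It is not the paper's code. The toy predictors, all function names, and the per-query definition of rotation standard deviation are assumptions made for illustration. A real PFN would also rotate its in-context training features by the same Q, and the orbit average here is a finite Monte Carlo approximation, so it is only approximately invariant.

```python
import math
import random
import statistics


def haar_orthogonal(d, rng):
    """Draw Q from O(d) by Gram-Schmidt on i.i.d. Gaussian vectors (Haar-distributed)."""
    rows = []
    while len(rows) < d:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for u in rows:  # remove components along the rows already accepted
            proj = sum(a * b for a, b in zip(v, u))
            v = [a - proj * b for a, b in zip(v, u)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-12:  # resample in the degenerate (measure-zero) case
            rows.append([a / norm for a in v])
    return rows


def rotate(Q, x):
    """Apply the feature rotation x -> Qx."""
    return [sum(q * xi for q, xi in zip(row, x)) for row in Q]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def invariant_predict(x):
    # Hypothetical stand-in: depends on x only through its norm, so it is O(d)-invariant.
    return sigmoid(sum(xi * xi for xi in x) - 1.0)


def variant_predict(x):
    # Hypothetical stand-in: a per-coordinate nonlinearity ties the output to the axes.
    return sigmoid(sum(math.tanh(xi) for xi in x))


def rotation_std(predict, x, rotations):
    """Standard deviation of the prediction at x across the sampled rotations."""
    return statistics.pstdev([predict(rotate(Q, x)) for Q in rotations])


def orbit_average(predict, rotations):
    """Monte Carlo orbit average: mean prediction over the sampled rotations."""
    def averaged(x):
        return sum(predict(rotate(Q, x)) for Q in rotations) / len(rotations)
    return averaged


if __name__ == "__main__":
    rng = random.Random(0)
    x = [0.5, -1.0, 2.0, 0.3]
    probes = [haar_orthogonal(4, rng) for _ in range(20)]
    orbit = [haar_orthogonal(4, rng) for _ in range(300)]
    print("invariant toy:", rotation_std(invariant_predict, x, probes))
    print("variant toy:  ", rotation_std(variant_predict, x, probes))
    print("orbit-averaged variant toy:",
          rotation_std(orbit_average(variant_predict, orbit), x, probes))
```

In this toy setting the norm-based predictor has rotation standard deviation at floating-point noise level, the coordinate-wise predictor varies visibly, and averaging over sampled rotations shrinks that variation. The sketch only illustrates the definitions and does not reproduce the paper's measurements.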