Is Spurious Correlation Removal Always Learnable?
Yibo Zhou ⋅ Bo Li ⋅ Hai-Miao Hu ⋅ Hanzi Wang ⋅ Xiaokang Zhang ⋅ Ruifan Zhang
Abstract
Invariant learning can fail even when the invariant structure is statistically identifiable. We exhibit an inherent computational barrier: under the Planted Clique hypothesis, there exist samplable linear-Gaussian multi-environment instances with a one-dimensional invariant subspace ($k=1$) that are learnable from polynomially many samples by exhaustive search yet intractable for every polynomial-time algorithm; the hardness follows from an average-case reduction from a supervised sparse primitive. We further quantify environment diversity by a separation parameter $\gamma$, which governs both identifiability and the curvature of invariance objectives. Under sufficient diversity, the minimax risk is $\mathbb{E}[\mathrm{dist}(\hat{V},V_{\mathrm{inv}})^2]=\Theta\bigl(k(d-k)/(n|\mathcal{E}|)\bigr)$, and under label-induced shifts a phase transition occurs at $n^*\propto k(d-k)/(|\mathcal{E}|\gamma^2)$. Experiments on synthetic and real datasets confirm the predicted gaps and transitions and motivate simple diversity diagnostics.
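As a rough illustration of the two scaling laws stated in the abstract, the following sketch plugs hypothetical parameter values into the minimax-risk rate $k(d-k)/(n|\mathcal{E}|)$ and the phase-transition threshold $n^*\propto k(d-k)/(|\mathcal{E}|\gamma^2)$. The numerical values of $k$, $d$, $|\mathcal{E}|$, and $\gamma$ are made up for illustration, and the proportionality constants are set to one; this is not an implementation of the paper's method.

```python
import numpy as np

# Illustrative only: hypothetical parameter values, constants set to 1.
k, d = 1, 50          # invariant-subspace dimension and ambient dimension
n_envs = 5            # number of environments |E|
gamma = 0.2           # environment-separation parameter

def minimax_risk_scale(n, k, d, n_envs):
    """Order of E[dist(V_hat, V_inv)^2] ~ k(d-k) / (n |E|)."""
    return k * (d - k) / (n * n_envs)

def critical_sample_size(k, d, n_envs, gamma):
    """Order of the phase-transition threshold n* ~ k(d-k) / (|E| gamma^2)."""
    return k * (d - k) / (n_envs * gamma ** 2)

n_star = critical_sample_size(k, d, n_envs, gamma)
for n in (100, int(n_star), 10_000):
    print(f"n = {n:6d}   risk scale ~ {minimax_risk_scale(n, k, d, n_envs):.4f}")
print(f"phase-transition scale n* ~ {n_star:.0f}")
```

With these assumed values, the threshold scale is $n^*\approx 245$, below which (under label-induced shifts) the predicted transition would not yet have occurred.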