When Source Curricula Hurt: A Theory of Curriculum Transfer Under Distribution Shift for Low-Resource Learning
Siddharth Karuturi ⋅ Kaustubh Bukkapatnam
Abstract
Curriculum learning (CL), which trains on examples ordered from easy to hard, is widely adopted in low-resource and transfer-learning settings, where practitioners borrow difficulty scorers from high-resource source tasks. Despite this widespread use, no formal theory characterises when such curriculum transfer is beneficial, neutral, or actively harmful. We develop the first comprehensive theoretical framework for curriculum transfer under distribution shift. Our central contributions are: (i) a \textit{Curriculum Transfer Bound} (Theorem 4.1) showing that source curricula achieve near-optimal target performance when the Wasserstein-2 distance between the source and target distributions is small; (ii) a \textit{Curriculum Reversal Theorem} (Theorem 4.2) proving that Wasserstein closeness alone is insufficient: there exist families of distributions where source curricula provably hurt despite vanishing distribution distance; (iii) a scalar \textit{Difficulty Correlation Coefficient} $\alpha(S, T)$ that serves as a sufficient statistic for transfer success (Theorem 4.3) and is estimable from a small unlabeled target calibration set; and (iv) an \textit{Adaptive Curriculum Algorithm} with sample complexity $\tilde{O}(1/\sqrt{m})$ using $m$ unlabeled target samples (Theorem 4.6). Experiments on eight Global-South-relevant benchmarks (cross-lingual NER for Swahili, Haitian Creole, Tigrinya, and Dhivehi; crop-disease classification from Sub-Saharan Africa; and TB screening from South Asian datasets) validate all theoretical predictions, with our adaptive method recovering $3.6$--$13.2$ F1/AUC points over harmful source curricula ($\alpha < 0$) at a fraction of the oracle labeling cost.
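The abstract does not define $\alpha(S, T)$ or the adaptive algorithm, so the following is only an illustrative sketch of the general idea, not the paper's estimator. It assumes, hypothetically, that $\alpha$ can be approximated by the Spearman rank correlation between the source scorer's difficulty values and some label-free target-side difficulty proxy (for example, the predictive entropy of a small model) on $m$ calibration samples. The names `rank_correlation`, `choose_curriculum`, and the `threshold` parameter are assumptions introduced for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_correlation(source_scores, target_scores):
    """Spearman correlation: an ASSUMED stand-in for alpha(S, T)."""
    rs, rt = average_ranks(source_scores), average_ranks(target_scores)
    ms, mt = mean(rs), mean(rt)
    cov = sum((a - ms) * (b - mt) for a, b in zip(rs, rt))
    var_s = sum((a - ms) ** 2 for a in rs)
    var_t = sum((b - mt) ** 2 for b in rt)
    if var_s == 0 or var_t == 0:
        return 0.0  # constant scores carry no ordering information
    return cov / (var_s * var_t) ** 0.5


def choose_curriculum(alpha_hat, threshold=0.0):
    """Hypothetical decision rule: keep the source ordering only if the
    estimated correlation is above the threshold."""
    return "source" if alpha_hat > threshold else "fallback"


# Toy calibration set: the two scorers order the examples oppositely.
alpha_hat = rank_correlation([0.1, 0.4, 0.7, 0.9], [0.8, 0.6, 0.3, 0.2])
print(alpha_hat, choose_curriculum(alpha_hat))
```

In this toy case the estimate is negative, which corresponds to the harmful regime ($\alpha < 0$) the abstract describes, so the sketch falls back to a non-source ordering. What that fallback should be, and how the threshold relates to the paper's guarantees, is not specified in the abstract.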