ProMeCD: Unifying Long-Tailed and Noisy Label Learning via White-Box Control
Abstract
Real-world data is rarely clean; it is plagued by severe class imbalance (long-tailed distributions) and label corruption. Current solutions lean heavily on ``black-box'' meta-learning to re-weight samples. However, this paradigm introduces a fatal circular dependency: it relies on pristine, balanced validation sets, essentially non-existent in the wild, to guide the optimization. We propose ProMeCD, a self-referential framework that breaks this dependency by recasting optimization as an autonomous control problem. Instead of training an opaque neural meta-learner, we employ a transparent proportional-integral (PI) controller. The system monitors ``cognitive entropy,'' a metric derived from von Mises-Fisher gradient statistics, to assess learning uncertainty. To resolve the scalar ambiguity between tail-class and noisy samples, ProMeCD employs a decoupled control strategy: it boosts tail classes via integral accumulation of magnitude deficits when directional consistency is high, and suppresses noise via proportional feedback when consistency collapses. Theoretically, we prove that this mechanism guarantees convergence and formally prevents the initial performance drop on minority classes, ensuring monotonic improvement for rare classes. Crucially, ProMeCD is fully white-box and validation-free. Experiments on CIFAR-LT, iNaturalist, CIFAR-N, and mini-WebVision confirm that ProMeCD is not merely efficient: it outperforms the recent meta-learner FMW-Net by over 10\% under severe imbalance, showing that explicit control theory offers a superior path to handling imperfect data.
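To make the decoupled control strategy concrete, the sketch below shows one possible per-class PI update under the assumptions the abstract implies: directional consistency is estimated as the mean resultant length of unit-normalized per-sample gradients (a standard proxy for von Mises-Fisher concentration), and the magnitude deficit is measured against a reference gradient norm. All names here (`pi_reweight`, `kappa_thresh`, `k_p`, `k_i`) and the reference norm of 1.0 are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def pi_reweight(grads, w_prev, integ, kappa_thresh=0.7, k_p=0.5, k_i=0.1):
    """Hypothetical decoupled PI re-weighting step for one class.

    grads : (n, d) array of per-sample gradients for this class.
    w_prev: previous class weight.
    integ : running integral of magnitude deficits for this class.
    """
    # Directional consistency: mean resultant length of the unit
    # gradients, in [0, 1]; high values indicate a coherent direction.
    units = grads / (np.linalg.norm(grads, axis=1, keepdims=True) + 1e-12)
    consistency = np.linalg.norm(units.mean(axis=0))

    # Magnitude deficit: how far the class's mean gradient norm falls
    # below a reference norm (assumed 1.0 here for illustration).
    deficit = max(0.0, 1.0 - np.linalg.norm(grads, axis=1).mean())

    if consistency >= kappa_thresh:
        # Consistent directions suggest a clean tail class: integral
        # action accumulates the deficit and gradually boosts the weight.
        integ += deficit
        w = w_prev + k_i * integ
    else:
        # Collapsed consistency suggests noisy labels: proportional
        # action suppresses the weight immediately.
        w = w_prev - k_p * (kappa_thresh - consistency)

    return max(0.0, w), integ
```

Under this reading, the integral branch gives rare-but-clean classes a slowly accumulating boost (which would explain the monotonic-improvement guarantee), while the proportional branch reacts instantly to incoherent gradients, so a single scalar weight can disambiguate the two failure modes without any validation set.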