Feature Collapse Under Corruption: An Entropy Perspective on Robust Neural Networks
Abstract
Even after decades of advances in neural network training, inherent robustness remains an open challenge. While sensitivity to adversarial perturbations is understandable given that such perturbations are intentionally crafted, the vulnerability to natural corruptions is far more surprising. The reason for this inherent vulnerability remains unknown, and it is not limited to traditional CNNs: it also affects current models, including transformers and large foundation models. In this work, we observe for the first time that natural corruptions often collapse the network's internal feature space into a high-entropy state, causing predictions to rely on a small subset of fragile features. Motivated by this observation, we propose a simple yet effective entropy-guided fine-tuning framework, Dem-HEC, that strengthens corruption robustness while maintaining clean accuracy. Our method generates high-entropy samples within a bounded perturbation region and trains on both clean and high-entropy samples, combined with knowledge distillation from a teacher snapshot to ensure stable predictions. The proposed Dem-HEC is effective across datasets from small to large resolution and across architectures from pure CNNs to transformers and large foundation models, including DinoV3. It outperforms state-of-the-art (SOTA) methods, not only improving robustness but also retaining or boosting clean accuracy.
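The abstract does not specify how high-entropy samples are generated within the bounded perturbation region. The following is a minimal sketch of one plausible reading: gradient ascent on the prediction-entropy objective, projected back into an L-infinity ball. The toy linear classifier, the finite-difference gradients, and all parameter values (`eps`, `steps`, `lr`) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def entropy(p):
    # Shannon entropy of a probability vector (small epsilon avoids log(0)).
    return -np.sum(p * np.log(p + 1e-12))

# Toy stand-in for the network: a fixed random linear classifier,
# 10-dimensional inputs, 3 classes (hypothetical, for illustration only).
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 3))

def predict(x):
    return softmax(W.T @ x)

def high_entropy_sample(x, eps=0.1, steps=5, lr=0.05):
    """Ascend prediction entropy within an L-inf ball of radius eps around x.

    Gradients are estimated by central finite differences, which keeps the
    sketch dependency-free; a real implementation would use autograd.
    """
    x_adv = x.copy()
    h = 1e-4
    for _ in range(steps):
        g = np.zeros_like(x_adv)
        for i in range(x_adv.size):
            d = np.zeros_like(x_adv)
            d[i] = h
            g[i] = (entropy(predict(x_adv + d))
                    - entropy(predict(x_adv - d))) / (2 * h)
        x_adv = x_adv + lr * np.sign(g)           # entropy-ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the ball
    return x_adv

x = rng.normal(size=10)
x_he = high_entropy_sample(x)
print("entropy before:", entropy(predict(x)))
print("entropy after: ", entropy(predict(x_he)))
```

In a full training loop, both `x` and `x_he` would be fed to the student, with a distillation loss against the teacher snapshot's predictions anchoring both to stable outputs.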