Poster in Workshop: The Second Workshop on Spurious Correlations, Invariance and Stability

Why is SAM Robust to Label Noise?

Christina Baek · Zico Kolter · Aditi Raghunathan


Abstract:

Sharpness-Aware Minimization (SAM) has recently achieved state-of-the-art generalization performance in both natural image and language tasks. Previous work has largely tried to understand this performance by characterizing SAM's solutions as lying in "flat" (low curvature) regions of the loss landscape. However, other works have shown that the correlation between various notions of flatness and generalization is weak, raising doubts about this justification. In this paper, we focus on understanding SAM in the presence of label noise, where the performance gains of SAM are especially pronounced. We first show that SAM's improved generalization can already be observed in linear logistic regression, where 1-SAM reduces to simply up-weighting the gradients from correctly labeled points during the early epochs of the training trajectory. Next, we empirically investigate how SAM's learning dynamics change for neural networks, showing that it exhibits similar behavior in how it handles noisy versus clean examples. We conclude that SAM's gains in the label noise setting can largely be explained by how it regularizes the speed at which different examples are learned during training.
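For reference, below is a minimal sketch of the 1-SAM update (SAM with batch size 1) in the linear logistic regression setting the abstract analyzes: for each example, the weights are first perturbed along the normalized per-example gradient, and the descent step then uses the gradient at the perturbed point. The hyperparameter values `rho` and `lr` and the function names are illustrative assumptions, not settings from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def one_sam_step(w, x, y, rho=0.05, lr=0.1):
    """One 1-SAM (batch-size-1 SAM) update for logistic regression.

    Assumes labels y in {-1, +1}; rho and lr are illustrative values.
    """
    # Per-example logistic loss: log(1 + exp(-y * w.x)),
    # whose gradient is -sigmoid(-y * w.x) * y * x.
    g = -sigmoid(-y * (w @ x)) * y * x
    norm = np.linalg.norm(g)
    if norm > 0:
        # SAM's ascent step: perturb w toward the nearby worst case,
        # then take the descent gradient at the perturbed point.
        w_adv = w + rho * g / norm
        g = -sigmoid(-y * (w_adv @ x)) * y * x
    return w - lr * g
```

In this linear case, the perturbation rescales each example's gradient by a logit-dependent factor, which is the mechanism behind the up-weighting of correctly labeled points described above.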
