

Poster in Workshop: The Second Workshop on Spurious Correlations, Invariance and Stability

Sharpness-Aware Minimization Enhances Feature Diversity

Jacob Mitchell Springer · Vaishnavh Nagarajan · Aditi Raghunathan


Abstract:

Sharpness-Aware Minimization (SAM) has emerged as a promising alternative to stochastic gradient descent (SGD) for minimizing the loss objective in neural network training. The motivation behind SAM is to bias models towards flatter minima that are believed to generalize better. However, recent studies have shown conflicting evidence on the relationship between flatness and generalization, leaving the mechanism behind SAM's performance improvement unclear. In this paper, we present theoretical and empirical evidence that SAM can enhance feature diversity compared to SGD in vision datasets containing redundant or spurious features. We further provide insights into this behavior of SAM by investigating a controlled setting, demonstrating how SAM can induce feature diversity. Our results imply that one mechanism by which SAM improves downstream generalization is by learning representations that rely on more diverse features.
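For readers unfamiliar with SAM, the sketch below illustrates the standard SAM update (ascend to an approximate worst-case perturbation of the weights, then descend using the gradient computed there). This is a minimal illustration, not the authors' implementation; the function name `sam_step`, the perturbation radius `rho`, and the passed-in `model`, `loss_fn`, and `base_optimizer` are assumptions made for this example.

```python
# Minimal sketch of one SAM update step (illustrative, not the paper's code).
import torch

def sam_step(model, loss_fn, x, y, base_optimizer, rho=0.05):
    # 1) Gradients at the current weights w.
    loss = loss_fn(model(x), y)
    loss.backward()

    # 2) Ascend to the approximate worst-case point w + e(w),
    #    with e(w) = rho * grad / ||grad||.
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2)
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)          # perturb the weights in place
            eps.append(e)
    model.zero_grad()

    # 3) Gradients at the perturbed weights.
    loss_perturbed = loss_fn(model(x), y)
    loss_perturbed.backward()

    # 4) Restore the original weights, then step with the perturbed gradients.
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)
    base_optimizer.step()
    base_optimizer.zero_grad()
    return loss.item()
```

In this two-step view, SAM reduces to SGD when `rho = 0`; the paper's claim is that the nonzero perturbation does more than flatten the loss landscape, it also changes which features the learned representation relies on.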
