
Sharpness-Aware Minimization Enhances Feature Diversity
Jacob Mitchell Springer · Vaishnavh Nagarajan · Aditi Raghunathan
Event URL: https://openreview.net/forum?id=1c1yDwLHat

Sharpness-Aware Minimization (SAM) has emerged as a promising alternative to stochastic gradient descent (SGD) for minimizing the loss objective in neural network training. The motivation behind SAM is to bias models towards flatter minima that are believed to generalize better. However, recent studies have shown conflicting evidence on the relationship between flatness and generalization, leaving the mechanism behind SAM's performance improvement unclear. In this paper, we present theoretical and empirical evidence that SAM can enhance feature diversity compared to SGD in vision datasets containing redundant or spurious features. We further provide insights into this behavior of SAM by investigating a controlled setting, demonstrating how SAM can induce feature diversity. Our results imply that one mechanism by which SAM improves downstream generalization is by learning representations that rely on more diverse features.

Author Information

Jacob Mitchell Springer (Carnegie Mellon University)
Vaishnavh Nagarajan (Google)
Aditi Raghunathan (Carnegie Mellon University)
