Sharpness-Aware Minimization (SAM) has emerged as a promising alternative to stochastic gradient descent (SGD) for minimizing the loss objective in neural network training. The motivation behind SAM is to bias models towards flatter minima, which are believed to generalize better. However, recent studies have reported conflicting evidence on the relationship between flatness and generalization, leaving the mechanism behind SAM's performance improvement unclear. In this paper, we present theoretical and empirical evidence that, on vision datasets containing redundant or spurious features, SAM learns more diverse features than SGD. We further investigate this behavior in a controlled setting, demonstrating how SAM can induce feature diversity. Our results imply that one mechanism by which SAM improves downstream generalization is by learning representations that rely on a more diverse set of features.
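As background for the abstract above: SAM (Foret et al., 2021) replaces the standard training loss L(w) with a worst-case loss over a small neighborhood, min_w max_{||eps||_2 <= rho} L(w + eps), and approximates the inner maximization with a single gradient ascent step. The sketch below is a minimal PyTorch-style illustration of this standard two-step update, not the paper's own training code; model, loss_fn, optimizer, and the neighborhood radius rho are assumed placeholders.

import torch

def sam_step(model, loss_fn, x, y, optimizer, rho=0.05):
    # First forward-backward pass: gradients at the current weights w.
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()

    # Climb to the approximate worst-case point w + eps, where
    # eps = rho * g / ||g|| follows from a first-order expansion of the loss.
    grad_norm = torch.sqrt(sum((p.grad ** 2).sum()
                               for p in model.parameters()
                               if p.grad is not None))
    perturbations = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            eps = rho * p.grad / (grad_norm + 1e-12)
            p.add_(eps)                      # move to w + eps
            perturbations.append((p, eps))

    # Second forward-backward pass: gradients at the perturbed point.
    optimizer.zero_grad()
    loss_fn(model(x), y).backward()

    # Undo the perturbation, then update w using the gradient from w + eps.
    with torch.no_grad():
        for p, eps in perturbations:
            p.sub_(eps)
    optimizer.step()

A training loop would call sam_step once per batch in place of the usual forward/backward/step; note that the two passes double the per-batch compute relative to SGD, which is SAM's well-known overhead.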
Author Information
Jacob Mitchell Springer (Carnegie Mellon University)
Vaishnavh Nagarajan (Google)
Aditi Raghunathan (Carnegie Mellon University)