Adaptive Sharpness-Aware Minimization with a Polyak-type Step size: A Theory-Grounded Scheduler
Dimitris Oikonomou ⋅ Nicolas Loizou
Abstract
Sharpness-Aware Minimization (SAM) has established itself as a powerful and widely adopted optimizer for training machine learning models. By explicitly minimizing the sharpness of the loss landscape, SAM often improves generalization while delivering strong empirical performance. However, SAM and its variants, like most training algorithms, are sensitive to the choice of learning rate, which is typically tuned by trial and error or via schedulers. In this work, motivated by recent advances demonstrating the effectiveness of stochastic Polyak step sizes for Stochastic Gradient Descent (SGD), we derive Polyak schedulers tailored to SAM-style updates, yielding novel adaptive algorithms in both deterministic and stochastic settings. In the smooth setting, we prove that the proposed methods converge linearly for strongly convex objectives and at an $O(1/T)$ rate (up to a neighborhood in the stochastic setting) for convex objectives. Numerical experiments demonstrate that the proposed Polyak schedulers match or surpass tuned SAM baselines while substantially reducing the need for learning-rate tuning.
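For concreteness, the following is a minimal sketch of how a Polyak-type step size can be paired with a SAM-style update. It combines the standard SAM perturbed gradient with the classical Polyak step size (which assumes knowledge of the optimal value $f^*$); this is an illustrative assumption, not necessarily the exact scheduler derived in the paper:
$$\epsilon_t = \rho \, \frac{\nabla f(x_t)}{\|\nabla f(x_t)\|}, \qquad \gamma_t = \frac{f(x_t) - f^*}{\|\nabla f(x_t + \epsilon_t)\|^2}, \qquad x_{t+1} = x_t - \gamma_t \, \nabla f(x_t + \epsilon_t).$$
In the stochastic setting, a natural analogue (in the spirit of the stochastic Polyak step size for SGD) would replace $f$ with a mini-batch loss $f_i$ and $f^*$ with the per-sample minimum $f_i^*$.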