

Poster in Workshop: HiLD: High-dimensional Learning Dynamics Workshop

On the Advantage of Lion Compared to signSGD with Momentum

Alessandro Noiato · Luca Biggio · Antonio Orvieto


Abstract:

This paper explores the relationship between the recently introduced Lion optimizer and the closely related signSGD with momentum (Signum), focusing on how each averages gradients before applying the sign operator. A precise formula for the averaging mechanisms of Lion and Signum is derived, and the two are compared in the deterministic convex setting. Empirical investigations highlight the advantage of Lion, a finding further supported by a convergence guarantee in the stochastic setting. Our work suggests that Lion can effectively balance fast and slow averaging, leading to stable and rapid convergence.
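
To make the difference in averaging concrete, here is a minimal sketch of a single scalar step of each method, written from the update rules as they are commonly stated for Signum and Lion rather than from this paper's derivation. The function names, the hyperparameter names (`beta`, `beta1`, `beta2`, `weight_decay`), and the default values are assumptions chosen for illustration.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def signum_step(theta, m, grad, lr=0.1, beta=0.9):
    """One Signum step: a single moving average feeds the sign operator."""
    m = beta * m + (1 - beta) * grad      # update the momentum buffer
    theta = theta - lr * sign(m)          # step along the sign of that same buffer
    return theta, m


def lion_step(theta, m, grad, lr=0.1, beta1=0.9, beta2=0.99, weight_decay=0.0):
    """One Lion step: two averaging rates, one for the step and one for the buffer."""
    c = beta1 * m + (1 - beta1) * grad                      # interpolation used only for this step
    theta = theta - lr * (sign(c) + weight_decay * theta)   # sign update with decoupled weight decay
    m = beta2 * m + (1 - beta2) * grad                      # buffer updated with a separate rate
    return theta, m


# Illustrative (made-up) state: a positive momentum buffer, then a large negative gradient.
theta_s, m_s = signum_step(0.0, 1.0, -20.0, beta=0.99)
theta_l, m_l = lion_step(0.0, 1.0, -20.0, beta1=0.9, beta2=0.99)
print(theta_s, theta_l)
```

In this made-up example, Signum with `beta=0.99` still steps in the direction of the old momentum, while Lion's step reacts to the new gradient even though its stored buffer is averaged just as slowly. This is one way to read the abstract's point about combining fast and slow averaging; it is not a reproduction of the paper's formula or experiments.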
