Convergence Analysis of the Lion Optimizer in Centralized and Distributed Settings
Wei Jiang ⋅ Mao Xu ⋅ Wenhao Yang ⋅ Yibo Wang ⋅ Zechao Li ⋅ Lijun Zhang
Abstract
In this paper, we provide a comprehensive convergence analysis for the Lion optimizer. First, we establish that the original Lion achieves a convergence rate of $\mathcal{O}(d^{1/2}T^{-1/4})$, where $d$ denotes the problem dimension and $T$ is the number of iterations. To improve this rate, we propose a variance-reduced variant of Lion, which attains an enhanced rate of $\mathcal{O}(d^{1/2}T^{-1/3})$ under the average smoothness assumption. We then extend our analysis to distributed settings. We demonstrate that the distributed Lion optimizer and its variance-reduced counterpart achieve linear speedup with respect to the number of nodes $n$, yielding convergence rates of $\mathcal{O}(d^{1/2}(nT)^{-1/4})$ and $\mathcal{O}(d^{1/2}(nT)^{-1/3})$, respectively. Additionally, we investigate a communication-efficient distributed Lion variant that utilizes sign compression for bidirectional communication. By employing unbiased sign operations, this variant achieves a convergence rate of $\mathcal{O} \left( \max \left\{ \frac{d^{1/4}}{T^{1/4}}, \frac{d^{1/10}}{n^{1/5}T^{1/5}} \right\} \right)$, and its variance-reduced counterpart further improves the rate to $\mathcal{O}\left( \frac{d^{1/4}}{T^{1/4}} \right)$. Finally, we conduct numerical experiments to validate the effectiveness of the proposed methods.
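For concreteness, a minimal sketch of one step of the standard Lion update rule analyzed here (sign-based update with decoupled weight decay), assuming the usual formulation with interpolation coefficients $\beta_1, \beta_2$; parameter names and defaults below are illustrative, not taken from this paper:

```python
def lion_step(x, m, grad, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
    """One Lion step on plain Python lists (illustrative sketch).

    x    -- current parameters
    m    -- momentum buffer
    grad -- stochastic gradient at x
    """
    new_x, new_m = [], []
    for xi, mi, gi in zip(x, m, grad):
        # Interpolate momentum and gradient to form the update direction.
        c = beta1 * mi + (1 - beta1) * gi
        # sign(c); zero maps to zero.
        sign_c = (c > 0) - (c < 0)
        # Signed update plus decoupled weight decay.
        new_x.append(xi - lr * (sign_c + wd * xi))
        # Momentum is updated with a separate coefficient beta2.
        new_m.append(beta2 * mi + (1 - beta2) * gi)
    return new_x, new_m
```

Because only the sign of the update direction is applied, each coordinate moves by exactly $\pm\eta$ per step, which is what introduces the $d^{1/2}$ dimension dependence in the rates above and makes sign compression natural in the distributed variants.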