Timezone: »

Spotlight
On Convergence of Gradient Descent Ascent: A Tight Local Analysis
Haochuan Li · Farzan Farnia · Subhro Das · Ali Jadbabaie

Tue Jul 19 02:15 PM -- 02:20 PM (PDT) @ Room 309
Gradient Descent Ascent (GDA) methods are the mainstream algorithms for minimax optimization in generative adversarial networks (GANs). Convergence properties of GDA have drawn significant interest in the recent literature. Specifically, for $\min_{x} \max_{y} f(x;y)$ where $f$ is strongly-concave in $y$ and possibly nonconvex in $x$, (Lin et al., 2020) proved the convergence of GDA with a stepsize ratio $\eta_y/\eta_x=\Theta(\kappa^2)$ where $\eta_x$ and $\eta_y$ are the stepsizes for $x$ and $y$ and $\kappa$ is the condition number for $y$. While this stepsize ratio suggests a slow training of the min player, practical GAN algorithms typically adopt similar stepsizes for both variables, indicating a wide gap between theoretical and empirical results. In this paper, we aim to bridge this gap by analyzing the \emph{local convergence} of general \emph{nonconvex-nonconcave} minimax problems. We demonstrate that a stepsize ratio of $\Theta(\kappa)$ is necessary and sufficient for local convergence of GDA to a Stackelberg Equilibrium, where $\kappa$ is the local condition number for $y$. We prove a nearly tight convergence rate with a matching lower bound. We further extend the convergence guarantees to stochastic GDA and extra-gradient methods (EG). Finally, we conduct several numerical experiments to support our theoretical findings.

#### Author Information

##### Subhro Das (MIT-IBM Watson AI Lab, IBM Research)

Subhro Das is a Research Staff Member and Manager at the MIT-IBM AI Lab, IBM Research, Cambridge MA. As a Principal Investigator (PI), he works on developing novel AI algorithms in collaboration with MIT. He is a Research Affiliate at MIT, co-leading IBM's engagement in the MIT Quest for Intelligence. He serves as the Chair of the AI Learning Professional Interest Community (PIC) at IBM Research. His research interests are broadly in the areas of Trustworthy ML, Reinforcement Learning and ML Optimization. At the MIT-IBM AI Lab, he works on developing novel AI algorithms for uncertainty quantification and human-centric AI systems; robust, accelerated, online & distributed optimization; and, safe, unstable & multi-agent reinforcement learning. He led the Future of Work initiative within IBM Research, studying the impact of AI on the labor market and developing AI-driven recommendation frameworks for skills and talent management. Previously, at the IBM T.J. Watson Research Center in New York, he worked on developing signal processing and machine learning based predictive algorithms for a broad variety of biomedical and healthcare applications. He received MS and PhD degrees in Electrical and Computer Engineering from Carnegie Mellon University in 2014 and 2016, respectively, and Bachelors (B.Tech.) degree in Electronics & Electrical Communication Engineering from Indian Institute of Technology Kharagpur in 2011.