

Poster

Double-Step Alternating Extragradient with Increasing Timescale Separation for Finding Local Minimax Points: Provable Improvements

Kyuwon Kim · Donghwan Kim


Abstract:

In nonconvex-nonconcave minimax optimization, two-timescale gradient methods have shown their potential to find local minimax (optimal) points, provided that the timescale separation between the min and max players is sufficiently large. However, existing two-timescale variants of gradient descent ascent (GDA) and extragradient (EG) methods face two shortcomings, especially when searching for degenerate local minimax points, which are prevalent in modern overparameterized settings: (1) they can be unstable at some degenerate points even with sufficiently large timescale separation, and (2) computing a proper amount of timescale separation is infeasible in practice. To remedy these two issues, we propose to incorporate two simple but provably effective schemes, a double-step alternating update and an increasing timescale separation, into the two-timescale EG. Under mild conditions, we show that the proposed methods converge to degenerate local minimax points to which all existing two-timescale methods fail to converge.
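
To make the two ingredients more concrete, the sketch below shows a generic two-timescale extragradient loop on a toy objective, where the max player's step size is scaled by a ratio that grows over iterations. This is only an illustration based on the abstract: the toy objective, step sizes, the linear schedule tau0 * (1 + growth * k), and the function names are all assumptions, and the paper's double-step alternating update rule is not reproduced here.

```python
import numpy as np

# Toy nonconvex-nonconcave objective (illustrative only):
#   f(x, y) = x * y - 0.5 * x**2 * y**2
def grads(x, y):
    gx = y - x * y**2          # df/dx
    gy = x - x**2 * y          # df/dy
    return gx, gy

def two_timescale_eg(x, y, eta=0.05, tau0=1.0, growth=0.005, num_iters=1000):
    """Simultaneous two-timescale extragradient with a growing step-size ratio.

    The max player (y) uses step size eta * tau_k, where tau_k increases with k
    (an assumed schedule). The double-step alternating variant proposed in the
    paper is not reproduced here; this is a plain two-timescale EG loop.
    """
    for k in range(num_iters):
        tau = tau0 * (1.0 + growth * k)   # increasing timescale separation (assumed schedule)
        # Extrapolation step at the current point.
        gx, gy = grads(x, y)
        x_half = x - eta * gx             # descent for the min player
        y_half = y + eta * tau * gy       # ascent for the max player, larger timescale
        # Update step using gradients at the extrapolated point.
        gx_h, gy_h = grads(x_half, y_half)
        x = x - eta * gx_h
        y = y + eta * tau * gy_h
    return x, y

print(two_timescale_eg(0.5, 0.5))
```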
