The success of minimax learning problems in generative adversarial networks (GANs) has been observed to depend on the minimax optimization algorithm used for their training. This dependence is commonly attributed to the convergence speed and robustness properties of the underlying optimization algorithm. In this paper, we show that the optimization algorithm also plays a key role in the generalization performance of the trained minimax model. To this end, we analyze the generalization properties of standard gradient descent ascent (GDA) and proximal point method (PPM) algorithms through the lens of algorithmic stability (Bousquet & Elisseeff, 2002) under both convex-concave and nonconvex-nonconcave minimax settings. While the GDA algorithm is not guaranteed to have a vanishing excess risk in convex-concave problems, we show that the PPM algorithm enjoys a bounded excess risk in the same setup. For nonconvex-nonconcave problems, we compare the generalization performance of stochastic GDA and GDmax algorithms, where the latter fully solves the maximization subproblem at every iteration. Our generalization analysis suggests the superiority of GDA provided that the minimization and maximization subproblems are solved simultaneously with similar learning rates. We discuss several numerical results indicating the role of optimization algorithms in the generalization of learned minimax models.
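The update rules contrasted in the abstract can be illustrated on a simple toy problem. The sketch below is not code from the paper; the convex-concave objective f(x, y) = xy + (lam/2)(x^2 - y^2), the step size, and the iteration counts are assumptions chosen for illustration. It shows simultaneous GDA, GDmax with an inner maximization loop, and an approximate PPM step obtained by fixed-point iteration on the implicit update.

```python
# Minimal, illustrative sketch (not the paper's code) of the three update rules
# discussed in the abstract, on a toy convex-concave objective
#   f(x, y) = x*y + (lam/2) * (x**2 - y**2),
# where x is the minimizing player and y the maximizing player. The objective,
# step size, and iteration counts below are hypothetical choices for illustration.

lam = 0.1   # strength of the convex/concave regularization (assumed)
eta = 0.05  # learning rate shared by both players (assumed)

def grad_x(x, y):
    return y + lam * x          # df/dx

def grad_y(x, y):
    return x - lam * y          # df/dy

def gda_step(x, y):
    """Simultaneous GDA: both players update at once with the same learning rate."""
    return x - eta * grad_x(x, y), y + eta * grad_y(x, y)

def gdmax_step(x, y, inner_steps=100):
    """GDmax: (approximately) solve the inner maximization over y, then descend on x."""
    for _ in range(inner_steps):
        y = y + eta * grad_y(x, y)
    return x - eta * grad_x(x, y), y

def ppm_step(x, y, fixed_point_iters=25):
    """PPM: approximate the implicit (proximal) update by fixed-point iteration."""
    x_new, y_new = x, y
    for _ in range(fixed_point_iters):
        x_new = x - eta * grad_x(x_new, y_new)
        y_new = y + eta * grad_y(x_new, y_new)
    return x_new, y_new

if __name__ == "__main__":
    x, y = 1.0, 1.0
    for _ in range(200):
        x, y = gda_step(x, y)
    print("GDA iterate after 200 steps:", (x, y))
```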
Author Information
Farzan Farnia (MIT)
Asuman Ozdaglar (MIT)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Spotlight: Train simultaneously, generalize better: Stability of gradient-based minimax learners
  Thu. Jul 22nd 01:45 -- 01:50 AM
More from the Same Authors
- 2021 : Decentralized Q-Learning in Zero-sum Markov Games
  Kaiqing Zhang · David Leslie · Tamer Basar · Asuman Ozdaglar
- 2023 : The Power of Duality Principle in Offline Average-Reward Reinforcement Learning
  Asuman Ozdaglar · Sarath Pattathil · Jiawei Zhang · Kaiqing Zhang
- 2023 : Time-Reversed Dissipation Induces Duality Between Minimizing Gradient Norm and Function Value
  Jaeyeon Kim · Asuman Ozdaglar · Chanwoo Park · Ernest Ryu
- 2023 Poster: Revisiting the Linear-Programming Framework for Offline RL with General Function Approximation
  Asuman Ozdaglar · Sarath Pattathil · Jiawei Zhang · Kaiqing Zhang
- 2022 : What is a Good Metric to Study Generalization of Minimax Learners?
  Asuman Ozdaglar · Sarath Pattathil · Jiawei Zhang · Kaiqing Zhang
- 2021 Poster: A Wasserstein Minimax Framework for Mixed Linear Regression
  Theo Diamandis · Yonina Eldar · Alireza Fallah · Farzan Farnia · Asuman Ozdaglar
- 2021 Oral: A Wasserstein Minimax Framework for Mixed Linear Regression
  Theo Diamandis · Yonina Eldar · Alireza Fallah · Farzan Farnia · Asuman Ozdaglar
- 2020 Poster: Do GANs always have Nash equilibria?
  Farzan Farnia · Asuman Ozdaglar