Poster
A new regret analysis for Adam-type algorithms
Ahmet Alacaoglu · Yura Malitsky · Panayotis Mertikopoulos · Volkan Cevher
Wed Jul 15 12:00 PM -- 12:45 PM & Thu Jul 16 12:00 AM -- 12:45 AM (PDT)
In this paper, we focus on a theory-practice gap for Adam and its variants (AMSGrad, AdamNC, etc.). In practice, these algorithms are used with a constant first-order moment parameter $\beta_{1}$ (typically between $0.9$ and $0.99$). In theory, regret guarantees for online convex optimization require a rapidly decaying $\beta_{1}\to0$ schedule. We show that this is an artifact of the standard analysis, and we propose a novel framework that allows us to derive optimal, data-dependent regret bounds with a constant $\beta_{1}$, without further assumptions. We also demonstrate the flexibility of our analysis on a wide range of different algorithms and settings.
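To make the setting concrete, the following is a minimal sketch of an AMSGrad-style update run with a constant $\beta_{1}$ in an online convex optimization loop. It is not the authors' code: the function name `amsgrad_constant_beta1`, the box-shaped feasible set, the $\alpha/\sqrt{t}$ step size, the small `eps` safeguard, and the quadratic test losses are all assumptions chosen for illustration.

```python
import numpy as np


def amsgrad_constant_beta1(grad_fn, x0, T, alpha=0.1, beta1=0.9, beta2=0.999,
                           lower=-1.0, upper=1.0, eps=1e-8):
    """AMSGrad-style sketch with a constant beta1 on a box [lower, upper]^d.

    grad_fn(t, x) returns the (sub)gradient of the round-t loss at x.
    Returns the array of played iterates x_1, ..., x_T.
    """
    x = np.asarray(x0, dtype=float).copy()
    m = np.zeros_like(x)      # first-moment (momentum) estimate
    v = np.zeros_like(x)      # second-moment estimate
    v_hat = np.zeros_like(x)  # running coordinate-wise maximum of v
    iterates = []
    for t in range(1, T + 1):
        iterates.append(x.copy())
        g = grad_fn(t, x)
        m = beta1 * m + (1.0 - beta1) * g   # beta1 stays constant over t
        v = beta2 * v + (1.0 - beta2) * g**2
        v_hat = np.maximum(v_hat, v)        # keeps the scaling non-decreasing
        step = alpha / np.sqrt(t)           # assumed step-size schedule
        # For a box and a diagonal scaling, the weighted projection
        # reduces to coordinate-wise clipping.
        x = np.clip(x - step * m / (np.sqrt(v_hat) + eps), lower, upper)
    return np.array(iterates)


# Hypothetical example: fixed quadratic losses f_t(x) = 0.5 * ||x - c||^2.
c = np.array([0.5, -0.3])
iterates = amsgrad_constant_beta1(lambda t, x: x - c, x0=np.zeros(2), T=200)

# The best fixed point is c with zero loss, so regret is the summed loss.
regret = 0.5 * np.sum((iterates - c) ** 2)
print("average regret:", regret / len(iterates))
```

Here `beta1` is held fixed at 0.9 for every round, which is the practical regime the abstract refers to; a decaying schedule would replace the constant with a round-dependent value.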
Author Information
Ahmet Alacaoglu (EPFL)
Yura Malitsky (EPFL)
Panayotis Mertikopoulos (CNRS and Criteo AI Lab)
Volkan Cevher (EPFL)
More from the Same Authors
-
2022 : Robustness in deep learning: The width (good), the depth (bad), and the initialization (ugly) »
Zhenyu Zhu · Fanghui Liu · Grigorios Chrysos · Volkan Cevher -
2022 : Sound and Complete Verification of Polynomial Networks »
Elias Abad Rocamora · Mehmet Fatih Sahin · Fanghui Liu · Grigorios Chrysos · Volkan Cevher -
2023 : Adversarial Training Should Be Cast as a Non-Zero-Sum Game »
Alex Robey · Fabian Latorre · George J. Pappas · Hamed Hassani · Volkan Cevher -
2023 Oral: Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees. »
Ioannis Panageas · Efstratios Panteleimon Skoulakis · Luca Viano · Xiao Wang · Volkan Cevher -
2023 Poster: When do Minimax-fair Learning and Empirical Risk Minimization Coincide? »
Harvineet Singh · Matthäus Kleindessner · Volkan Cevher · Rumi Chunara · Chris Russell -
2023 Poster: Benign Overfitting in Deep Neural Networks under Lazy Training »
Zhenyu Zhu · Fanghui Liu · Grigorios Chrysos · Francesco Locatello · Volkan Cevher -
2023 Poster: What can online reinforcement learning with function approximation benefit from general coverage conditions? »
Fanghui Liu · Luca Viano · Volkan Cevher -
2023 Poster: Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees. »
Ioannis Panageas · Efstratios Panteleimon Skoulakis · Luca Viano · Xiao Wang · Volkan Cevher -
2023 Poster: Multi-Agent Online Optimization with Delays: Asynchronicity, Adaptivity, and Optimism »
Yu-Guan Hsieh · Franck Iutzeler · Jérôme Malick · Panayotis Mertikopoulos -
2023 : 1-Path-Norm Regularization of Deep Neural Networks »
Fabian Latorre · Antoine Bonnet · Paul Rolland · Nadav Hallak · Volkan Cevher -
2022 Poster: Score Matching Enables Causal Discovery of Nonlinear Additive Noise Models »
Paul Rolland · Volkan Cevher · Matthäus Kleindessner · Chris Russell · Dominik Janzing · Bernhard Schölkopf · Francesco Locatello -
2022 Poster: Nested Bandits »
Matthieu Martin · Panayotis Mertikopoulos · Thibaud J Rahier · Houssam Zenati -
2022 Poster: UnderGrad: A Universal Black-Box Optimization Method with Almost Dimension-Free Convergence Rate Guarantees »
Kimon Antonakopoulos · Dong Quan Vu · Volkan Cevher · Kfir Levy · Panayotis Mertikopoulos -
2022 Oral: UnderGrad: A Universal Black-Box Optimization Method with Almost Dimension-Free Convergence Rate Guarantees »
Kimon Antonakopoulos · Dong Quan Vu · Volkan Cevher · Kfir Levy · Panayotis Mertikopoulos -
2022 Spotlight: Nested Bandits »
Matthieu Martin · Panayotis Mertikopoulos · Thibaud J Rahier · Houssam Zenati -
2022 Oral: Score Matching Enables Causal Discovery of Nonlinear Additive Noise Models »
Paul Rolland · Volkan Cevher · Matthäus Kleindessner · Chris Russell · Dominik Janzing · Bernhard Schölkopf · Francesco Locatello -
2022 Poster: A Natural Actor-Critic Framework for Zero-Sum Markov Games »
Ahmet Alacaoglu · Luca Viano · Niao He · Volkan Cevher -
2022 Spotlight: A Natural Actor-Critic Framework for Zero-Sum Markov Games »
Ahmet Alacaoglu · Luca Viano · Niao He · Volkan Cevher -
2022 Poster: AdaGrad Avoids Saddle Points »
Kimon Antonakopoulos · Panayotis Mertikopoulos · Georgios Piliouras · Xiao Wang -
2022 Spotlight: AdaGrad Avoids Saddle Points »
Kimon Antonakopoulos · Panayotis Mertikopoulos · Georgios Piliouras · Xiao Wang -
2021 Poster: The Limits of Min-Max Optimization Algorithms: Convergence to Spurious Non-Critical Sets »
Ya-Ping Hsieh · Panayotis Mertikopoulos · Volkan Cevher -
2021 Poster: Regret Minimization in Stochastic Non-Convex Learning via a Proximal-Gradient Approach »
Nadav Hallak · Panayotis Mertikopoulos · Volkan Cevher -
2021 Spotlight: Regret Minimization in Stochastic Non-Convex Learning via a Proximal-Gradient Approach »
Nadav Hallak · Panayotis Mertikopoulos · Volkan Cevher -
2021 Oral: The Limits of Min-Max Optimization Algorithms: Convergence to Spurious Non-Critical Sets »
Ya-Ping Hsieh · Panayotis Mertikopoulos · Volkan Cevher -
2021 Poster: Zeroth-Order Non-Convex Learning via Hierarchical Dual Averaging »
Amélie Héliou · Matthieu Martin · Panayotis Mertikopoulos · Thibaud J Rahier -
2021 Spotlight: Zeroth-Order Non-Convex Learning via Hierarchical Dual Averaging »
Amélie Héliou · Matthieu Martin · Panayotis Mertikopoulos · Thibaud J Rahier -
2020 Poster: Efficient Proximal Mapping of the 1-path-norm of Shallow Networks »
Fabian Latorre · Paul Rolland · Shaul Nadav Hallak · Volkan Cevher -
2020 Poster: Conditional gradient methods for stochastically constrained convex minimization »
Maria-Luiza Vladarean · Ahmet Alacaoglu · Ya-Ping Hsieh · Volkan Cevher -
2020 Poster: Random extrapolation for primal-dual coordinate descent »
Ahmet Alacaoglu · Olivier Fercoq · Volkan Cevher -
2020 Poster: Double-Loop Unadjusted Langevin Algorithm »
Paul Rolland · Armin Eftekhari · Ali Kavis · Volkan Cevher -
2020 Poster: Gradient-free Online Learning in Continuous Games with Delayed Rewards »
Amélie Héliou · Panayotis Mertikopoulos · Zhengyuan Zhou -
2020 Poster: Adaptive Gradient Descent without Descent »
Yura Malitsky · Konstantin Mishchenko -
2020 Poster: Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games »
Tianyi Lin · Zhengyuan Zhou · Panayotis Mertikopoulos · Michael Jordan -
2019 Poster: Cautious Regret Minimization: Online Optimization with Long-Term Budget Constraints »
Nikolaos Liakopoulos · Apostolos Destounis · Georgios Paschos · Thrasyvoulos Spyropoulos · Panayotis Mertikopoulos -
2019 Oral: Cautious Regret Minimization: Online Optimization with Long-Term Budget Constraints »
Nikolaos Liakopoulos · Apostolos Destounis · Georgios Paschos · Thrasyvoulos Spyropoulos · Panayotis Mertikopoulos -
2019 Poster: Almost surely constrained convex optimization »
Olivier Fercoq · Ahmet Alacaoglu · Ion Necoara · Volkan Cevher -
2019 Poster: Finding Mixed Nash Equilibria of Generative Adversarial Networks »
Ya-Ping Hsieh · Chen Liu · Volkan Cevher -
2019 Poster: Efficient learning of smooth probability functions from Bernoulli tests with guarantees »
Paul Rolland · Ali Kavis · Alexander Niklaus Immer · Adish Singla · Volkan Cevher -
2019 Oral: Finding Mixed Nash Equilibria of Generative Adversarial Networks »
Ya-Ping Hsieh · Chen Liu · Volkan Cevher -
2019 Oral: Efficient learning of smooth probability functions from Bernoulli tests with guarantees »
Paul Rolland · Ali Kavis · Alexander Niklaus Immer · Adish Singla · Volkan Cevher -
2019 Oral: Almost surely constrained convex optimization »
Olivier Fercoq · Ahmet Alacaoglu · Ion Necoara · Volkan Cevher -
2019 Poster: On Certifying Non-Uniform Bounds against Adversarial Attacks »
Chen Liu · Ryota Tomioka · Volkan Cevher -
2019 Poster: Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator »
Alp Yurtsever · Suvrit Sra · Volkan Cevher -
2019 Poster: A Conditional-Gradient-Based Augmented Lagrangian Framework »
Alp Yurtsever · Olivier Fercoq · Volkan Cevher -
2019 Oral: Conditional Gradient Methods via Stochastic Path-Integrated Differential Estimator »
Alp Yurtsever · Suvrit Sra · Volkan Cevher -
2019 Oral: A Conditional-Gradient-Based Augmented Lagrangian Framework »
Alp Yurtsever · Olivier Fercoq · Volkan Cevher -
2019 Oral: On Certifying Non-Uniform Bounds against Adversarial Attacks »
Chen Liu · Ryota Tomioka · Volkan Cevher -
2018 Poster: A Conditional Gradient Framework for Composite Convex Minimization with Applications to Semidefinite Programming »
Alp Yurtsever · Olivier Fercoq · Francesco Locatello · Volkan Cevher -
2018 Oral: A Conditional Gradient Framework for Composite Convex Minimization with Applications to Semidefinite Programming »
Alp Yurtsever · Olivier Fercoq · Francesco Locatello · Volkan Cevher -
2018 Poster: Let’s be Honest: An Optimal No-Regret Framework for Zero-Sum Games »
Ehsan Asadi Kangarshahi · Ya-Ping Hsieh · Mehmet Fatih Sahin · Volkan Cevher -
2018 Poster: Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go? »
Zhengyuan Zhou · Panayotis Mertikopoulos · Nicholas Bambos · Peter Glynn · Yinyu Ye · Li-Jia Li · Li Fei-Fei -
2018 Poster: Optimal Distributed Learning with Multi-pass Stochastic Gradient Methods »
Junhong Lin · Volkan Cevher -
2018 Oral: Let’s be Honest: An Optimal No-Regret Framework for Zero-Sum Games »
Ehsan Asadi Kangarshahi · Ya-Ping Hsieh · Mehmet Fatih Sahin · Volkan Cevher -
2018 Oral: Optimal Distributed Learning with Multi-pass Stochastic Gradient Methods »
Junhong Lin · Volkan Cevher -
2018 Oral: Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go? »
Zhengyuan Zhou · Panayotis Mertikopoulos · Nicholas Bambos · Peter Glynn · Yinyu Ye · Li-Jia Li · Li Fei-Fei -
2018 Poster: Optimal Rates of Sketched-regularized Algorithms for Least-Squares Regression over Hilbert Spaces »
Junhong Lin · Volkan Cevher -
2018 Oral: Optimal Rates of Sketched-regularized Algorithms for Least-Squares Regression over Hilbert Spaces »
Junhong Lin · Volkan Cevher -
2017 Poster: Robust Submodular Maximization: A Non-Uniform Partitioning Approach »
Ilija Bogunovic · Slobodan Mitrovic · Jonathan Scarlett · Volkan Cevher -
2017 Talk: Robust Submodular Maximization: A Non-Uniform Partitioning Approach »
Ilija Bogunovic · Slobodan Mitrovic · Jonathan Scarlett · Volkan Cevher