In scalable machine learning systems, model training is often parallelized over multiple nodes that run without tight synchronization. Most analysis results for the related asynchronous algorithms use an upper bound on the information delays in the system to determine learning rates. Not only are such bounds hard to obtain in advance, but they also result in unnecessarily slow convergence. In this paper, we show that it is possible to use learning rates that depend on the actual time-varying delays in the system. We develop general convergence results for delay-adaptive asynchronous iterations and specialize these to proximal incremental gradient descent and block coordinate descent algorithms. For each of these methods, we demonstrate how delays can be measured on-line, present delay-adaptive step-size policies, and illustrate their theoretical and practical advantages over the state-of-the-art.
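The core idea in the abstract — scaling the step size by the *measured* delay of each applied gradient rather than a worst-case delay bound — can be illustrated with a small simulation. The sketch below is a hedged toy example, not the paper's exact policy: it runs asynchronous gradient descent on f(x) = 0.5·x², applies gradients computed at stale iterates, and uses the illustrative rule eta_t = eta0 / (1 + tau_t), where tau_t is the observed staleness.

```python
import random

# Toy simulation of delay-adaptive step sizes for asynchronous
# gradient descent on f(x) = 0.5 * x^2, so grad f(x) = x.
# The rule eta_t = eta0 / (1 + tau_t) is an illustrative policy
# (an assumption for this sketch), not the paper's exact step size.

def grad(x):
    return x  # gradient of 0.5 * x^2

def async_gd(x0=10.0, eta0=0.5, max_delay=4, iters=200, seed=0):
    rng = random.Random(seed)
    x = x0
    history = []  # iterates indexed by iteration number
    for t in range(iters):
        history.append(x)
        # An arriving gradient was computed on a stale iterate x_{t - tau}.
        tau = rng.randint(0, min(max_delay, t))
        x_stale = history[t - tau]
        # Delay-adaptive step size: shrink with the *measured* delay tau,
        # instead of using a fixed worst-case bound on the delay.
        eta = eta0 / (1 + tau)
        x = x - eta * grad(x_stale)
    return x

print(async_gd())  # final iterate, close to the minimizer x* = 0
```

Because the step size only shrinks when a gradient is actually stale, fast (low-delay) updates keep the large step eta0, which is the practical advantage the abstract claims over worst-case-bound tuning.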
Author Information
Xuyang Wu (KTH Royal Institute of Technology)
Sindri Magnússon (Stockholm University)
Hamid Reza Feyzmahdavian (ABB)
Mikael Johansson (KTH Royal Institute of Technology)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Spotlight: Delay-Adaptive Step-sizes for Asynchronous Learning
  Thu. Jul 21st, 07:40 -- 07:45 PM, Room 310
More from the Same Authors
- 2023 Poster: Generalized Polyak Step Size for First Order Optimization with Momentum
  Xiaoyu Wang · Mikael Johansson · Tong Zhang
- 2023 Poster: Delay-agnostic Asynchronous Coordinate Update Algorithm
  Xuyang Wu · Changxin Liu · Sindri Magnússon · Mikael Johansson
- 2021 Poster: Stability and Convergence of Stochastic Gradient Clipping: Beyond Lipschitz Continuity and Smoothness
  Vien Mai · Mikael Johansson
- 2021 Oral: Stability and Convergence of Stochastic Gradient Clipping: Beyond Lipschitz Continuity and Smoothness
  Vien Mai · Mikael Johansson
- 2020 Poster: Anderson Acceleration of Proximal Gradient Methods
  Vien Mai · Mikael Johansson
- 2020 Poster: Convergence of a Stochastic Gradient Method with Momentum for Non-Smooth Non-Convex Optimization
  Vien Mai · Mikael Johansson
- 2019 Poster: Curvature-Exploiting Acceleration of Elastic Net Computations
  Vien Mai · Mikael Johansson
- 2019 Oral: Curvature-Exploiting Acceleration of Elastic Net Computations
  Vien Mai · Mikael Johansson