One of the most widely used optimization methods for large-scale machine learning problems is distributed asynchronous stochastic gradient descent (DASGD). However, a key issue that arises here is that of delayed gradients: when a “worker” node asynchronously contributes a gradient update to the “master”, the global model parameter may have changed, rendering this information stale. In massively parallel computing grids, these delays can quickly add up if the computational throughput of a node is saturated, so the convergence of DASGD is uncertain under these conditions. Nevertheless, by using a judiciously chosen quasilinear step-size sequence, we show that it is possible to amortize these delays and achieve global convergence with probability 1, even when the delays grow at a polynomial rate. In this way, our results help reaffirm the successful application of DASGD to large-scale optimization problems.
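The dynamic described above can be illustrated with a small simulation. The sketch below is not the paper's algorithm or step-size schedule: it runs a toy single-process simulation of DASGD on a one-dimensional quadratic, where the gradient applied at step n is computed from a stale iterate read d(n) steps earlier, with delays that grow over time; the 1/n step size is a simple placeholder for the quasilinear schedule analyzed in the paper.

```python
import random

def grad(x, noise):
    # Stochastic gradient of f(x) = x^2 / 2 plus bounded noise.
    return x + noise

def dasgd(steps=20000, seed=0):
    # Toy simulation (hypothetical setup, not the paper's algorithm):
    # each update uses a gradient evaluated at a stale iterate from
    # step n - d(n), with delays d(n) growing sublinearly in n.
    rng = random.Random(seed)
    history = [10.0]                        # iterate history; x_0 = 10
    for n in range(1, steps + 1):
        delay = min(n - 1, int(n ** 0.5))   # growing (unbounded) delay
        stale_x = history[-1 - delay]       # worker read an old iterate
        g = grad(stale_x, rng.uniform(-0.1, 0.1))
        step = 1.0 / n                      # placeholder step-size schedule
        history.append(history[-1] - step * g)
    return history[-1]

x_final = dasgd()
```

Despite every gradient being stale, the iterate still drifts toward the minimizer at 0, because the shrinking step sizes damp the effect of the delays; with a constant step size, the same growing delays can destabilize the iterates.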
Author Information
Zhengyuan Zhou (Stanford University)
Panayotis Mertikopoulos (CNRS)
Nicholas Bambos (Stanford University)
Peter Glynn (Stanford University)
Yinyu Ye
Li-Jia Li (Google)
Li Fei-Fei (Stanford University)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Oral: Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go? »
  Thu Jul 12th 09:50 -- 10:00 AM Room A9
More from the Same Authors
- 2020 Poster: Gradient-free Online Learning in Continuous Games with Delayed Rewards »
  Amélie Héliou · Panayotis Mertikopoulos · Zhengyuan Zhou
- 2020 Poster: My Fair Bandit: Distributed Learning of Max-Min Fairness with Multi-player Bandits »
  Ilai Bistritz · Tavor Z Baharav · Amir Leshem · Nicholas Bambos
- 2020 Poster: A new regret analysis for Adam-type algorithms »
  Ahmet Alacaoglu · Yura Malitsky · Panayotis Mertikopoulos · Volkan Cevher
- 2020 Poster: Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games »
  Darren Lin · Zhengyuan Zhou · Panayotis Mertikopoulos · Michael Jordan
- 2020 Poster: Distributionally Robust Policy Evaluation and Learning in Offline Contextual Bandits »
  Nian Si · Fan Zhang · Zhengyuan Zhou · Jose Blanchet
- 2019 Poster: Cautious Regret Minimization: Online Optimization with Long-Term Budget Constraints »
  Nikolaos Liakopoulos · Apostolos Destounis · Georgios Paschos · Thrasyvoulos Spyropoulos · Panayotis Mertikopoulos
- 2019 Oral: Cautious Regret Minimization: Online Optimization with Long-Term Budget Constraints »
  Nikolaos Liakopoulos · Apostolos Destounis · Georgios Paschos · Thrasyvoulos Spyropoulos · Panayotis Mertikopoulos
- 2019 Poster: Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning »
  Casey Chu · Jose Blanchet · Peter Glynn
- 2019 Oral: Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning »
  Casey Chu · Jose Blanchet · Peter Glynn
- 2018 Poster: MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels »
  Lu Jiang · Zhengyuan Zhou · Thomas Leung · Li-Jia Li · Li Fei-Fei
- 2018 Oral: MentorNet: Learning Data-Driven Curriculum for Very Deep Neural Networks on Corrupted Labels »
  Lu Jiang · Zhengyuan Zhou · Thomas Leung · Li-Jia Li · Li Fei-Fei