Distributed data-parallel algorithms aim to accelerate the training of deep neural networks by parallelizing the computation of large mini-batch gradient updates across multiple nodes. Approaches that synchronize nodes using exact distributed averaging (e.g., via AllReduce) are sensitive to stragglers and communication delays. The PushSum gossip algorithm is robust to these issues, but only performs approximate distributed averaging. This paper studies Stochastic Gradient Push (SGP), which combines PushSum with stochastic gradient updates. We prove that SGP converges to a stationary point of smooth, non-convex objectives at the same sub-linear rate as SGD, and that all nodes achieve consensus. We empirically validate the performance of SGP on image classification (ResNet-50, ImageNet) and machine translation (Transformer, WMT'16 En-De) workloads.
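As an illustration of how PushSum and gradient updates can be combined, the following is a minimal scalar sketch, not the authors' implementation. The function name `sgp`, the toy quadratic objectives f_i(z) = 0.5 * (z - targets[i])^2, the small directed graph, and the uniform out-edge weights are all assumptions made for the example. Each node keeps a PushSum numerator `x` and weight `w`, takes a local gradient step evaluated at the de-biased value `z = x / w`, and then pushes equal shares of `x` and `w` to its out-neighbors.

```python
def sgp(init, targets, lr, steps):
    """Toy scalar sketch of gradient steps combined with PushSum mixing.

    Hypothetical helper for illustration only; assumes len(init) >= 3.
    Node i minimizes f_i(z) = 0.5 * (z - targets[i]) ** 2.
    """
    n = len(init)
    # out_neighbors[i]: nodes that node i pushes to (itself and the next node).
    out_neighbors = [[i, (i + 1) % n] for i in range(n)]
    # One extra edge 0 -> 2 so the mixing is column-stochastic but not
    # row-stochastic, the setting in which PushSum de-biasing matters.
    out_neighbors[0].append(2)

    x = list(init)   # PushSum numerators
    w = [1.0] * n    # PushSum weights
    for _ in range(steps):
        # De-biased parameters at which each node evaluates its gradient.
        z = [x[i] / w[i] for i in range(n)]
        # Local gradient step applied to the numerator.
        x_half = [x[i] - lr * (z[i] - targets[i]) for i in range(n)]
        # Push equal shares of (x, w) along every out-edge, self-loop included.
        new_x = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            share = 1.0 / len(out_neighbors[i])
            for j in out_neighbors[i]:
                new_x[j] += share * x_half[i]
                new_w[j] += share * w[i]
        x, w = new_x, new_w
    return [x[i] / w[i] for i in range(n)]


# With lr = 0 the loop reduces to plain PushSum averaging of the initial values.
estimates = sgp([1.0, 2.0, 3.0, 6.0], [0.0] * 4, lr=0.0, steps=200)
```

With `lr=0.0` every entry of `estimates` approaches the average of the initial values, even though the nodes only ever exchange messages with their out-neighbors. A positive `lr` adds the gradient term on top of the same mixing step.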
Author Information
Mahmoud Assran (McGill University/Facebook AI Research)
Nicolas Loizou (The University of Edinburgh)
https://www.maths.ed.ac.uk/~s1461357/
Nicolas Ballas (Facebook FAIR)
Michael Rabbat (Facebook)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Oral: Stochastic Gradient Push for Distributed Deep Learning
  Wed. Jun 12th 06:25 -- 06:30 PM, Room 103
More from the Same Authors
- 2022: Positive Unlabeled Contrastive Representation Learning
  Anish Acharya · Sujay Sanghavi · Li Jing · Bhargav Bhushanam · Michael Rabbat · Dhruv Choudhary · Inderjit Dhillon
- 2022 Poster: Federated Learning with Partial Model Personalization
  Krishna Pillutla · Kshitiz Malik · Abdel-rahman Mohamed · Michael Rabbat · Maziar Sanjabi · Lin Xiao
- 2022 Spotlight: Federated Learning with Partial Model Personalization
  Krishna Pillutla · Kshitiz Malik · Abdel-rahman Mohamed · Michael Rabbat · Maziar Sanjabi · Lin Xiao
- 2020 Poster: On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings
  Mahmoud Assran · Michael Rabbat
- 2019 Poster: TarMAC: Targeted Multi-Agent Communication
  Abhishek Das · Theophile Gervet · Joshua Romoff · Dhruv Batra · Devi Parikh · Michael Rabbat · Joelle Pineau
- 2019 Oral: TarMAC: Targeted Multi-Agent Communication
  Abhishek Das · Theophile Gervet · Joshua Romoff · Dhruv Batra · Devi Parikh · Michael Rabbat · Joelle Pineau
- 2019 Poster: SGD: General Analysis and Improved Rates
  Robert Gower · Nicolas Loizou · Xun Qian · Alibek Sailanbayev · Egor Shulgin · Peter Richtarik
- 2019 Oral: SGD: General Analysis and Improved Rates
  Robert Gower · Nicolas Loizou · Xun Qian · Alibek Sailanbayev · Egor Shulgin · Peter Richtarik