Oral
Trading Redundancy for Communication: Speeding up Distributed SGD for Non-convex Optimization
Farzin Haddadpour · Mohammad Mahdi Kamani · Mehrdad Mahdavi · Viveck Cadambe

Wed Jun 12 11:35 AM -- 11:40 AM (PDT) @ Room 103

Communication overhead is one of the key challenges that hinder the scalability of distributed optimization algorithms for training large neural networks. In recent years, a great deal of research has sought to reduce communication cost by compressing the gradient vector or by using local updates and periodic model averaging. In this paper, we aim to develop communication-efficient distributed stochastic algorithms for non-convex optimization through effective data replication strategies. In particular, we show, both theoretically and empirically, that by properly infusing redundancy into the training data and combining it with model averaging, it is possible to significantly reduce the number of communication rounds. To be more precise, for a predetermined level of redundancy, the proposed algorithm samples mini-batches from redundant chunks of data from multiple workers in updating local solutions. As a byproduct, we also show that the proposed algorithm is robust to failures. Our empirical studies on the CIFAR10 and CIFAR100 datasets in a distributed environment complement our theoretical results.
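
The idea in the abstract (local updates, periodic model averaging, and mini-batches drawn from redundantly stored data chunks) can be illustrated with a small toy sketch. This is not the authors' implementation: the cyclic chunk placement, the function names (`assign_chunks`, `make_chunks`, `local_sgd_with_redundancy`), the hyperparameters, and the one-parameter least-squares objective are all assumptions chosen for brevity, and the objective here is convex, whereas the paper targets non-convex problems.

```python
import random


def assign_chunks(num_workers, redundancy):
    """Assumed cyclic placement: worker i stores chunks i, i+1, ..., i+redundancy-1 (mod p)."""
    return [[(i + k) % num_workers for k in range(redundancy)]
            for i in range(num_workers)]


def make_chunks(num_workers, chunk_size, true_w=3.0, seed=0):
    """Hypothetical toy data: noise-free pairs (x, true_w * x), split into one chunk per worker."""
    rng = random.Random(seed)
    chunks = []
    for _ in range(num_workers):
        xs = [rng.uniform(-1.0, 1.0) for _ in range(chunk_size)]
        chunks.append([(x, true_w * x) for x in xs])
    return chunks


def local_sgd_with_redundancy(chunks, redundancy, local_steps, rounds,
                              batch_size=4, lr=0.1, seed=0):
    """Local SGD with periodic averaging; each worker samples from its redundant chunks."""
    rng = random.Random(seed)
    num_workers = len(chunks)
    assignment = assign_chunks(num_workers, redundancy)
    # Each worker's local pool is the union of the chunks assigned to it.
    pools = [[pt for c in assignment[i] for pt in chunks[c]]
             for i in range(num_workers)]
    w = 0.0  # shared scalar model
    for _ in range(rounds):
        local_models = []
        for pool in pools:
            w_local = w
            for _ in range(local_steps):
                batch = rng.sample(pool, batch_size)
                # Gradient of the mean squared error (w * x - y)^2 over the mini-batch.
                grad = sum(2.0 * (w_local * x - y) * x for x, y in batch) / batch_size
                w_local -= lr * grad
            local_models.append(w_local)
        # One communication round: average the local models.
        w = sum(local_models) / num_workers
    return w


chunks = make_chunks(num_workers=4, chunk_size=8)
w_hat = local_sgd_with_redundancy(chunks, redundancy=2, local_steps=10, rounds=30)
print(w_hat)
```

In this sketch, `redundancy=1` reduces to plain local SGD on disjoint chunks, while larger values make the workers' local pools overlap. It only shows the mechanics of sampling from redundant chunks and makes no claim about the communication savings or convergence rates analyzed in the paper.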

Author Information

Farzin Haddadpour (Pennsylvania State University)
Mohammad Mahdi Kamani (The Pennsylvania State University)
Mehrdad Mahdavi (Pennsylvania State University)
Viveck Cadambe (Pennsylvania State University)

More from the Same Authors

  • 2019 : Poster Session I
    Stark Draper · Mehmet Aktas · Basak Guler · Hongyi Wang · Venkata Gandikota · Hyegyeong Park · Jinhyun So · Lev Tauz · hema venkata krishna giri Narra · Zhifeng Lin · Mohammadali Maddahali · Yaoqing Yang · Sanghamitra Dutta · Amirhossein Reisizadeh · Jianyu Wang · Eren Balevi · Siddharth Jain · Paul McVay · Michael Rudow · Pedro Soto · Jun Li · Adarsh Subramaniam · Umut Demirhan · Vipul Gupta · Deniz Oktay · Leighton P Barnes · Johannes Ballé · Farzin Haddadpour · Haewon Jeong · Rong-Rong Chen · Mohammad Fahim
  • 2019 : Targeted Meta-Learning for Critical Incident Detection in Weather Data
    Mohammad Mahdi Kamani · Sadegh Farhang · Mehrdad Mahdavi · James Wang
  • 2019 : Networking Lunch (provided) + Poster Session
    Abraham Stanway · Alex Robson · Aneesh Rangnekar · Ashesh Chattopadhyay · Ashley Pilipiszyn · Benjamin LeRoy · Bolong Cheng · Ce Zhang · Chaopeng Shen · Christian Schroeder · Christian Clough · Clement DUHART · Clement Fung · Cozmin Ududec · Dali Wang · David Dao · di wu · Dimitrios Giannakis · Dino Sejdinovic · Doina Precup · Duncan Watson-Parris · Gege Wen · George Chen · Gopal Erinjippurath · Haifeng Li · Han Zou · Herke van Hoof · Hillary A Scannell · Hiroshi Mamitsuka · Hongbao Zhang · Jaegul Choo · James Wang · James Requeima · Jessica Hwang · Jinfan Xu · Johan Mathe · Jonathan Binas · Joonseok Lee · Kalai Ramea · Kate Duffy · Kevin McCloskey · Kris Sankaran · Lester Mackey · Letif Mones · Loubna Benabbou · Lynn Kaack · Matthew Hoffman · Mayur Mudigonda · Mehrdad Mahdavi · Michael McCourt · Mingchao Jiang · Mohammad Mahdi Kamani · Neel Guha · Niccolo Dalmasso · Nick Pawlowski · Nikola Milojevic-Dupont · Paulo Orenstein · Pedram Hassanzadeh · Pekka Marttinen · Ramesh Nair · Sadegh Farhang · Samuel Kaski · Sandeep Manjanna · Sasha Luccioni · Shuby Deshpande · Soo Kim · Soukayna Mouatadid · Sunghyun Park · Tao Lin · Telmo Felgueira · Thomas Hornigold · Tianle Yuan · Tom Beucler · Tracy Cui · Volodymyr Kuleshov · Wei Yu · yang song · Ydo Wexler · Yoshua Bengio · Zhecheng Wang · Zhuangfang Yi · Zouheir Malki