Gradient Coding: Avoiding Stragglers in Distributed Learning
Rashish Tandon · Qi Lei · Alexandros Dimakis · Nikos Karampatziakis

Mon Aug 07 11:24 PM -- 11:42 PM (PDT) @ C4.8

We propose a novel coding-theoretic framework for mitigating stragglers in distributed learning. We show how carefully replicating data blocks and coding across gradients can provide tolerance to failures and stragglers for synchronous gradient descent. We implement our schemes in Python (using MPI) to run on Amazon EC2, and compare them against baseline approaches in running time and generalization error.
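The idea of replicating data blocks and coding across gradients can be illustrated with a small sketch. It assumes three workers, three data blocks, and tolerance to one straggler. The encoding matrix `B`, the table `DECODE`, and the helpers `encode` and `decode` are illustrative names chosen here, not taken from the authors' MPI implementation. Each worker stores two of the three blocks and sends one linear combination of its two partial gradients; any two of the three messages are enough to recover the full gradient sum.

```python
# Illustrative sketch only: 3 workers, 3 data blocks, 1 tolerated straggler.
# B, DECODE, encode and decode are hypothetical names for this example.

# Row i holds the coefficients worker i applies to the partial gradients
# (g1, g2, g3). Each row has two nonzero entries, so each worker needs
# only two of the three data blocks.
B = [
    [0.5, 1.0, 0.0],   # worker 0 sends g1/2 + g2
    [0.0, 1.0, -1.0],  # worker 1 sends g2 - g3
    [0.5, 0.0, 1.0],   # worker 2 sends g1/2 + g3
]

# For each pair of responding workers (i, j), coefficients (a, b) such that
# a * B[i] + b * B[j] equals the all-ones vector (1, 1, 1).
DECODE = {
    (0, 1): (2.0, -1.0),
    (0, 2): (1.0, 1.0),
    (1, 2): (1.0, 2.0),
}


def encode(worker, partial_grads):
    """Return the coded vector that `worker` sends to the master."""
    dim = len(partial_grads[0])
    return [
        sum(B[worker][j] * partial_grads[j][k] for j in range(3))
        for k in range(dim)
    ]


def decode(received):
    """Recover g1 + g2 + g3 from the messages of any two workers.

    `received` maps a worker index to its coded vector. If all three
    workers respond, only the first two (by index) are used.
    """
    i, j = sorted(received)[:2]
    a, b = DECODE[(i, j)]
    return [a * x + b * y for x, y in zip(received[i], received[j])]


# Toy partial gradients, one per data block.
grads = [[2.0, 4.0], [1.0, -3.0], [0.5, 6.0]]
full_gradient = [sum(col) for col in zip(*grads)]

# Simulate worker 1 straggling: the master hears only from workers 0 and 2.
messages = {w: encode(w, grads) for w in (0, 2)}
recovered = decode(messages)
```

In this sketch the master never waits for the slowest worker, at the cost of each worker computing gradients on two blocks instead of one.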

Author Information

Rashish Tandon (University of Texas at Austin)
Qi Lei (University of Texas at Austin)
Alex Dimakis (UT Austin)

Alex Dimakis is an Associate Professor in the Electrical and Computer Engineering department at the University of Texas at Austin. He received his Ph.D. in electrical engineering and computer sciences from UC Berkeley. He received an ARO young investigator award in 2014, the NSF CAREER award in 2011, a Google faculty research award in 2012 and the Eli Jury dissertation award in 2008. He is the co-recipient of several best paper awards, including the joint Information Theory and Communications Society Best Paper Award in 2012. His research interests include information theory, coding theory and machine learning.

Nikos Karampatziakis (Microsoft)

More from the Same Authors