Poster
FetchSGD: Communication-Efficient Federated Learning with Sketching
Daniel Rothchild · Ashwinee Panda · Enayat Ullah · Nikita Ivkin · Ion Stoica · Vladimir Braverman · Joseph E Gonzalez · Raman Arora

Tue Jul 14 07:00 AM -- 07:45 AM & Tue Jul 14 06:00 PM -- 06:45 PM (PDT)

Existing approaches to federated learning suffer from a communication bottleneck as well as convergence issues due to sparse client participation. In this paper we introduce a novel algorithm, called FetchSGD, to overcome these challenges. FetchSGD compresses model updates using a Count Sketch, and then takes advantage of the mergeability of sketches to combine model updates from many workers. A key insight in the design of FetchSGD is that, because the Count Sketch is linear, momentum and error accumulation can both be carried out within the sketch. This allows the algorithm to move momentum and error accumulation from clients to the central aggregator, overcoming the challenges of sparse client participation while still achieving high compression rates and good convergence. We prove that FetchSGD has favorable convergence guarantees, and we demonstrate its empirical effectiveness by training two residual networks and a transformer model.
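The linearity property the abstract relies on can be illustrated with a small NumPy sketch. This is a minimal illustration, not the authors' implementation: the function names (`make_hashes`, `sketch`, `unsketch`, `server_step`), the use of fully random hash tables, the top-k extraction step, and the defaults for `lr` and `rho` are all assumptions made for the example. It shows that sketches of client gradients can be averaged, and that momentum and error accumulation can be kept as sketches on the aggregator.

```python
import numpy as np


def make_hashes(dim, rows, cols, seed=0):
    """Draw one bucket index and one random sign per (row, coordinate).

    Assumption: fully random tables stand in for the hash functions
    a real Count Sketch would use.
    """
    rng = np.random.default_rng(seed)
    buckets = rng.integers(0, cols, size=(rows, dim))
    signs = rng.choice([-1.0, 1.0], size=(rows, dim))
    return buckets, signs


def sketch(vec, buckets, signs, cols):
    """Compress a dense vector into a (rows x cols) Count Sketch table."""
    rows = buckets.shape[0]
    table = np.zeros((rows, cols))
    for r in range(rows):
        # Each coordinate adds its signed value to one bucket per row.
        table[r] = np.bincount(buckets[r], weights=signs[r] * vec, minlength=cols)
    return table


def unsketch(table, buckets, signs):
    """Estimate every coordinate as the median of its signed bucket values."""
    rows = table.shape[0]
    per_row = signs * table[np.arange(rows)[:, None], buckets]
    return np.median(per_row, axis=0)


def server_step(client_sketches, momentum, error, buckets, signs, cols,
                k, lr=0.1, rho=0.9):
    """One hypothetical aggregator step carried out on sketches.

    momentum and error are themselves sketch tables, so no per-client
    state is needed. lr, rho, and the top-k rule are illustrative choices.
    """
    merged = np.mean(client_sketches, axis=0)      # sketches merge by averaging
    momentum = rho * momentum + merged             # momentum inside the sketch
    error = error + lr * momentum                  # error accumulation inside the sketch
    dense = unsketch(error, buckets, signs)
    top = np.argpartition(np.abs(dense), -k)[-k:]  # keep the k largest estimates
    update = np.zeros_like(dense)
    update[top] = dense[top]
    error = error - sketch(update, buckets, signs, cols)  # drop what was applied
    return update, momentum, error


if __name__ == "__main__":
    dim, rows, cols = 1000, 5, 256
    buckets, signs = make_hashes(dim, rows, cols)
    rng = np.random.default_rng(1)
    g1, g2 = rng.normal(size=dim), rng.normal(size=dim)
    # Linearity: the sum of two sketches equals the sketch of the sum.
    lhs = sketch(g1, buckets, signs, cols) + sketch(g2, buckets, signs, cols)
    print(np.allclose(lhs, sketch(g1 + g2, buckets, signs, cols)))
```

Because every operation in `server_step` before `unsketch` is a linear combination of tables, the aggregator never needs a client's uncompressed gradient, which is what allows momentum and error state to live on the server rather than on sparsely participating clients.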

Author Information

Daniel Rothchild (UC Berkeley)
Ashwinee Panda (UC Berkeley)
Enayat Ullah (Johns Hopkins University)
Nikita Ivkin (Amazon)
Ion Stoica (UC Berkeley)
Vladimir Braverman (Johns Hopkins University)
Joseph E Gonzalez (UC Berkeley)
Raman Arora (Johns Hopkins University)
Raman Arora

Raman Arora received his M.S. and Ph.D. degrees in Electrical and Computer Engineering from the University of Wisconsin-Madison in 2005 and 2009, respectively. From 2009 to 2011, he was a Postdoctoral Research Associate at the University of Washington in Seattle and a Visiting Researcher at Microsoft Research Redmond. Since 2011, he has been with Toyota Technological Institute at Chicago (TTIC). His research interests include machine learning, speech recognition, and statistical signal processing.
