Semi-Cyclic Stochastic Gradient Descent
Hubert Eichner · Tomer Koren · H. Brendan McMahan · Nati Srebro · Kunal Talwar

Thu Jun 13th 05:15 -- 05:20 PM @ Seaside Ballroom

We consider convex SGD updates with a block-cyclic structure, i.e., where each cycle consists of a small number of blocks, each with many samples from a possibly different, block-specific distribution. This situation arises, e.g., in Federated Learning, where the mobile devices available for updates at different times of the day have different characteristics. We show that such a block-cyclic structure can significantly degrade the performance of SGD, and we propose a simple correction approach that allows prediction with the same performance guarantees as for i.i.d., non-cyclic sampling.
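
As a rough illustration of the block-cyclic setting described above, the following sketch runs plain SGD on a one-dimensional squared loss over a stream that visits two blocks in turn. It is not the authors' code and does not implement their correction; the block means, step size, block length, and function names are all assumptions chosen for the example. It only shows how the last iterate can track the most recent block rather than the minimizer of the mixture over blocks.

```python
import random


def block_cyclic_stream(block_means, samples_per_block, num_cycles, rng):
    """Yield (block_index, sample) pairs in block-cyclic order.

    Each cycle visits every block once; block b draws from N(block_means[b], 1).
    """
    for _ in range(num_cycles):
        for b, mu in enumerate(block_means):
            for _ in range(samples_per_block):
                yield b, rng.gauss(mu, 1.0)


def sgd_last_iterate(stream, step_size=0.05, w0=0.0):
    """Run plain SGD on the loss 0.5 * (w - x)^2 and return the final iterate."""
    w = w0
    for _, x in stream:
        w -= step_size * (w - x)  # the gradient of 0.5 * (w - x)^2 is (w - x)
    return w


rng = random.Random(0)
block_means = [-2.0, 2.0]  # hypothetical: two blocks, e.g. "day" and "night" devices
stream = block_cyclic_stream(block_means, samples_per_block=500, num_cycles=3, rng=rng)

w_final = sgd_last_iterate(stream)
mixture_opt = sum(block_means) / len(block_means)  # minimizer for the uniform mixture

print(f"final iterate: {w_final:.2f}, mixture optimum: {mixture_opt:.2f}")
```

In this toy setup the stream ends in the second block, so the final iterate sits near that block's mean and far from the mixture optimum, which is the kind of degradation the abstract refers to.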

Hubert Eichner (Google)
Tomer Koren (Google Brain)
Brendan McMahan (Google)
Nati Srebro (Toyota Technological Institute at Chicago)
Kunal Talwar (Google)

