Poster
Bias-Variance Reduced Local SGD for Less Heterogeneous Federated Learning
Tomoya Murata · Taiji Suzuki
Recently, local SGD has received much attention and has been extensively studied in the distributed learning community as a way to overcome the communication bottleneck. However, the superiority of local SGD over minibatch SGD holds only in quite limited situations. In this paper, we study a new local algorithm called Bias-Variance Reduced Local SGD (BVR-L-SGD) for nonconvex distributed optimization. Algorithmically, our proposed bias- and variance-reduced local gradient estimator fully utilizes the small second-order heterogeneity of the local objectives, and the algorithm randomly picks one of the local models, instead of taking their average, when workers are synchronized. Theoretically, under small heterogeneity of the local objectives, we show that BVR-L-SGD achieves better communication complexity than both previous non-local and local methods under mild conditions; in particular, BVR-L-SGD is the first method that breaks the communication-complexity barrier of $\Theta(1/\varepsilon)$ for general nonconvex smooth objectives when the heterogeneity is small and the local computation budget is large. Numerical results verify the theoretical findings and give empirical evidence of the superiority of our method.
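As a rough illustration of the synchronization rule described in the abstract (randomly picking one local model instead of averaging), here is a minimal sketch in plain local SGD form. The gradient oracle `local_grad`, the worker count, step size, and round counts are illustrative assumptions, and the paper's bias- and variance-reduced gradient estimator is replaced by ordinary stochastic gradients for brevity; this is a sketch, not the authors' implementation.

```python
import numpy as np

# Minimal sketch: each worker runs local SGD steps from the current global
# model, then one randomly chosen local model (rather than the average)
# becomes the next global model, as the abstract suggests. The oracle and
# hyperparameters are hypothetical; the bias/variance reduction is omitted.

def local_sgd_random_pick(x0, local_grad, n_workers=8, rounds=10,
                          local_steps=20, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(rounds):
        local_models = []
        for p in range(n_workers):
            xp = x.copy()
            for _ in range(local_steps):
                xp = xp - lr * local_grad(p, xp)  # worker p's local update
            local_models.append(xp)
        # Synchronization: pick one local model uniformly at random
        # instead of averaging.
        x = local_models[rng.integers(n_workers)]
    return x

if __name__ == "__main__":
    # Toy heterogeneous objectives: worker p minimizes 0.5 * ||x - c_p||^2.
    centers = np.linspace(-1.0, 1.0, 8)
    noise = np.random.default_rng(1)
    grad = lambda p, x: (x - centers[p]) + 0.01 * noise.standard_normal(x.shape)
    # Without the paper's bias/variance reduction, the iterate tends to hop
    # among individual workers' minimizers under strong heterogeneity.
    print(local_sgd_random_pick(np.zeros(1), grad))
```

Note that with plain stochastic gradients and heterogeneous objectives this scheme does not converge to the global solution, which is precisely the gap the bias- and variance-reduced estimator is designed to close when the second-order heterogeneity is small.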
Author Information
Tomoya Murata (NTT DATA Mathematical Systems Inc.)
Taiji Suzuki (The University of Tokyo / RIKEN)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Spotlight: Bias-Variance Reduced Local SGD for Less Heterogeneous Federated Learning » Tue. Jul 20th, 12:25 -- 12:30 PM
More from the Same Authors
- 2021 Poster: On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting » Shunta Akiyama · Taiji Suzuki
- 2021 Spotlight: On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting » Shunta Akiyama · Taiji Suzuki
- 2021 Poster: Quantitative Understanding of VAE as a Non-linearly Scaled Isometric Embedding » Akira Nakagawa · Keizo Kato · Taiji Suzuki
- 2021 Spotlight: Quantitative Understanding of VAE as a Non-linearly Scaled Isometric Embedding » Akira Nakagawa · Keizo Kato · Taiji Suzuki
- 2019 Poster: Approximation and non-parametric estimation of ResNet-type convolutional neural networks » Kenta Oono · Taiji Suzuki
- 2019 Oral: Approximation and non-parametric estimation of ResNet-type convolutional neural networks » Kenta Oono · Taiji Suzuki
- 2018 Poster: Functional Gradient Boosting based on Residual Network Perception » Atsushi Nitanda · Taiji Suzuki
- 2018 Oral: Functional Gradient Boosting based on Residual Network Perception » Atsushi Nitanda · Taiji Suzuki