We study local SGD (also known as parallel SGD and federated SGD), a natural and frequently used distributed optimization method. Its theoretical foundations are currently lacking, and we highlight how all existing error guarantees in the convex setting are dominated by a simple baseline, minibatch SGD. (1) For quadratic objectives we prove that local SGD strictly dominates minibatch SGD and that accelerated local SGD is minimax optimal; (2) For general convex objectives we provide the first guarantee that at least \emph{sometimes} improves over minibatch SGD, although our guarantee does not always improve over, nor even match, minibatch SGD; (3) We show that local SGD does \emph{not} dominate minibatch SGD in general, by presenting a lower bound on the performance of local SGD that is worse than the minibatch SGD guarantee.
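To make the comparison concrete, here is a minimal sketch of the two algorithms on a toy one-dimensional quadratic. This is an illustration of the general schemes, not the paper's analysis or experiments; the objective, noise model, and all hyperparameters below are assumptions chosen for the example. Both methods use the same budget: R communication rounds and M * K stochastic gradients per round.

```python
import numpy as np

# Toy quadratic F(x) = x^2 / 2, whose stochastic gradient is x plus
# Gaussian noise. This setup is illustrative only.
def stochastic_grad(x, rng, noise_std=1.0):
    return x + noise_std * rng.standard_normal()

def minibatch_sgd(x0, lr, rounds, machines, local_steps, seed=0):
    # Baseline: R steps, each averaging M * K stochastic gradients computed
    # at the single shared iterate (one communication per step).
    rng = np.random.default_rng(seed)
    x = x0
    for _ in range(rounds):
        g = np.mean([stochastic_grad(x, rng)
                     for _ in range(machines * local_steps)])
        x = x - lr * g
    return x

def local_sgd(x0, lr, rounds, machines, local_steps, seed=0):
    # Each of the M machines takes K sequential SGD steps starting from the
    # current consensus point; the local iterates are then averaged, so there
    # is one communication per K local steps.
    rng = np.random.default_rng(seed)
    x = x0
    for _ in range(rounds):
        finished = []
        for _ in range(machines):
            xm = x
            for _ in range(local_steps):
                xm = xm - lr * stochastic_grad(xm, rng)
            finished.append(xm)
        x = np.mean(finished)
    return x
```

The structural difference is that minibatch SGD evaluates all gradients at a shared iterate, while local SGD lets each machine move during a round; the paper's question is which of the two achieves lower error under this matched budget.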
Author Information
Blake Woodworth (Toyota Technological Institute at Chicago)
Kumar Kshitij Patel (Toyota Technological Institute at Chicago)
Sebastian Stich (EPFL)
Zhen Dai (University of Chicago)
Brian Bullins (Toyota Technological Institute at Chicago)
Brendan McMahan (Google)
Ohad Shamir (Weizmann Institute of Science)
Nati Srebro (Toyota Technological Institute at Chicago)
More from the Same Authors
-
2023 : When is Agnostic Reinforcement Learning Statistically Tractable? »
Gene Li · Zeyu Jia · Alexander Rakhlin · Ayush Sekhari · Nati Srebro -
2023 : On the Still Unreasonable Effectiveness of Federated Averaging for Heterogeneous Distributed Learning »
Kumar Kshitij Patel · Margalit Glasgow · Lingxiao Wang · Nirmit Joshi · Nati Srebro -
2023 : Brendan McMahan: Advances in Privacy and Federated Learning, with Applications to GBoard »
Brendan McMahan -
2023 Poster: Federated Online and Bandit Convex Optimization »
Kumar Kshitij Patel · Lingxiao Wang · Aadirupa Saha · Nati Srebro -
2023 Poster: Continual Learning in Linear Classification on Separable Data »
Itay Evron · Edward Moroshko · Gon Buzaglo · Maroun Khriesh · Badea Marjieh · Nati Srebro · Daniel Soudry -
2022 Poster: Implicit Bias of the Step Size in Linear Diagonal Neural Networks »
Mor Shpigel Nacson · Kavya Ravichandran · Nati Srebro · Daniel Soudry -
2022 Spotlight: Implicit Bias of the Step Size in Linear Diagonal Neural Networks »
Mor Shpigel Nacson · Kavya Ravichandran · Nati Srebro · Daniel Soudry -
2021 : Algorithms for Efficient Federated and Decentralized Learning (Q&A) »
Sebastian Stich -
2021 : Algorithms for Efficient Federated and Decentralized Learning »
Sebastian Stich -
2021 Poster: Fast margin maximization via dual acceleration »
Ziwei Ji · Nati Srebro · Matus Telgarsky -
2021 Poster: Practical and Private (Deep) Learning Without Sampling or Shuffling »
Peter Kairouz · Brendan McMahan · Shuang Song · Om Dipakbhai Thakkar · Abhradeep Guha Thakurta · Zheng Xu -
2021 Spotlight: Fast margin maximization via dual acceleration »
Ziwei Ji · Nati Srebro · Matus Telgarsky -
2021 Spotlight: Practical and Private (Deep) Learning Without Sampling or Shuffling »
Peter Kairouz · Brendan McMahan · Shuang Song · Om Dipakbhai Thakkar · Abhradeep Guha Thakurta · Zheng Xu -
2021 Poster: Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels »
Eran Malach · Pritish Kamath · Emmanuel Abbe · Nati Srebro -
2021 Spotlight: Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels »
Eran Malach · Pritish Kamath · Emmanuel Abbe · Nati Srebro -
2021 Poster: Consensus Control for Decentralized Deep Learning »
Lingjing Kong · Tao Lin · Anastasiia Koloskova · Martin Jaggi · Sebastian Stich -
2021 Poster: Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data »
Tao Lin · Sai Praneeth Reddy Karimireddy · Sebastian Stich · Martin Jaggi -
2021 Poster: Dropout: Explicit Forms and Capacity Control »
Raman Arora · Peter Bartlett · Poorya Mianjy · Nati Srebro -
2021 Spotlight: Dropout: Explicit Forms and Capacity Control »
Raman Arora · Peter Bartlett · Poorya Mianjy · Nati Srebro -
2021 Spotlight: Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data »
Tao Lin · Sai Praneeth Reddy Karimireddy · Sebastian Stich · Martin Jaggi -
2021 Spotlight: Consensus Control for Decentralized Deep Learning »
Lingjing Kong · Tao Lin · Anastasiia Koloskova · Martin Jaggi · Sebastian Stich -
2021 Poster: On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent »
Shahar Azulay · Edward Moroshko · Mor Shpigel Nacson · Blake Woodworth · Nati Srebro · Amir Globerson · Daniel Soudry -
2021 Oral: On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent »
Shahar Azulay · Edward Moroshko · Mor Shpigel Nacson · Blake Woodworth · Nati Srebro · Amir Globerson · Daniel Soudry -
2020 : Keynote Session 5: Advances and Open Problems in Federated Learning, by Brendan McMahan (Google) »
Brendan McMahan -
2020 Poster: Extrapolation for Large-batch Training in Deep Learning »
Tao Lin · Lingjing Kong · Sebastian Stich · Martin Jaggi -
2020 Poster: The Complexity of Finding Stationary Points with Stochastic Gradient Descent »
Yoel Drori · Ohad Shamir -
2020 Poster: A Unified Theory of Decentralized SGD with Changing Topology and Local Updates »
Anastasiia Koloskova · Nicolas Loizou · Sadra Boreiri · Martin Jaggi · Sebastian Stich -
2020 Poster: Proving the Lottery Ticket Hypothesis: Pruning is All You Need »
Eran Malach · Gilad Yehudai · Shai Shalev-Schwartz · Ohad Shamir -
2020 Poster: Efficiently Learning Adversarially Robust Halfspaces with Noise »
Omar Montasser · Surbhi Goel · Ilias Diakonikolas · Nati Srebro -
2020 Poster: SCAFFOLD: Stochastic Controlled Averaging for Federated Learning »
Sai Praneeth Reddy Karimireddy · Satyen Kale · Mehryar Mohri · Sashank Jakkam Reddi · Sebastian Stich · Ananda Theertha Suresh -
2020 Poster: Fair Learning with Private Demographic Data »
Hussein Mozannar · Mesrob Ohannessian · Nati Srebro -
2019 : Nati Srebro: Optimization’s Untold Gift to Learning: Implicit Regularization »
Nati Srebro -
2019 : Panel Discussion (Nati Srebro, Dan Roy, Chelsea Finn, Mikhail Belkin, Aleksander Mądry, Jason Lee) »
Nati Srebro · Daniel Roy · Chelsea Finn · Mikhail Belkin · Aleksander Madry · Jason Lee -
2019 Workshop: Understanding and Improving Generalization in Deep Learning »
Dilip Krishnan · Hossein Mobahi · Behnam Neyshabur · Behnam Neyshabur · Peter Bartlett · Dawn Song · Nati Srebro -
2019 Poster: Semi-Cyclic Stochastic Gradient Descent »
Hubert Eichner · Tomer Koren · Brendan McMahan · Nati Srebro · Kunal Talwar -
2019 Oral: Semi-Cyclic Stochastic Gradient Descent »
Hubert Eichner · Tomer Koren · Brendan McMahan · Nati Srebro · Kunal Talwar -
2019 Poster: Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints »
Andrew Cotter · Maya Gupta · Heinrich Jiang · Nati Srebro · Karthik Sridharan · Serena Wang · Blake Woodworth · Seungil You -
2019 Poster: Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication »
Anastasiia Koloskova · Sebastian Stich · Martin Jaggi -
2019 Poster: Error Feedback Fixes SignSGD and other Gradient Compression Schemes »
Sai Praneeth Reddy Karimireddy · Quentin Rebjock · Sebastian Stich · Martin Jaggi -
2019 Poster: Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models »
Mor Shpigel Nacson · Suriya Gunasekar · Jason Lee · Nati Srebro · Daniel Soudry -
2019 Oral: Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints »
Andrew Cotter · Maya Gupta · Heinrich Jiang · Nati Srebro · Karthik Sridharan · Serena Wang · Blake Woodworth · Seungil You -
2019 Oral: Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication »
Anastasiia Koloskova · Sebastian Stich · Martin Jaggi -
2019 Oral: Lexicographic and Depth-Sensitive Margins in Homogeneous and Non-Homogeneous Deep Models »
Mor Shpigel Nacson · Suriya Gunasekar · Jason Lee · Nati Srebro · Daniel Soudry -
2019 Oral: Error Feedback Fixes SignSGD and other Gradient Compression Schemes »
Sai Praneeth Reddy Karimireddy · Quentin Rebjock · Sebastian Stich · Martin Jaggi -
2018 Poster: Spurious Local Minima are Common in Two-Layer ReLU Neural Networks »
Itay Safran · Ohad Shamir -
2018 Poster: On Matching Pursuit and Coordinate Descent »
Francesco Locatello · Anant Raj · Sai Praneeth Reddy Karimireddy · Gunnar Ratsch · Bernhard Schölkopf · Sebastian Stich · Martin Jaggi -
2018 Oral: On Matching Pursuit and Coordinate Descent »
Francesco Locatello · Anant Raj · Sai Praneeth Reddy Karimireddy · Gunnar Ratsch · Bernhard Schölkopf · Sebastian Stich · Martin Jaggi -
2018 Oral: Spurious Local Minima are Common in Two-Layer ReLU Neural Networks »
Itay Safran · Ohad Shamir -
2018 Poster: Characterizing Implicit Bias in Terms of Optimization Geometry »
Suriya Gunasekar · Jason Lee · Daniel Soudry · Nati Srebro -
2018 Oral: Characterizing Implicit Bias in Terms of Optimization Geometry »
Suriya Gunasekar · Jason Lee · Daniel Soudry · Nati Srebro -
2017 Poster: Efficient Distributed Learning with Sparsity »
Jialei Wang · Mladen Kolar · Nati Srebro · Tong Zhang -
2017 Poster: Distributed Mean Estimation with Limited Communication »
Ananda Theertha Suresh · Felix Xinnan Yu · Sanjiv Kumar · Brendan McMahan -
2017 Poster: Approximate Steepest Coordinate Descent »
Sebastian Stich · Anant Raj · Martin Jaggi -
2017 Talk: Efficient Distributed Learning with Sparsity »
Jialei Wang · Mladen Kolar · Nati Srebro · Tong Zhang -
2017 Talk: Distributed Mean Estimation with Limited Communication »
Ananda Theertha Suresh · Felix Xinnan Yu · Sanjiv Kumar · Brendan McMahan -
2017 Talk: Approximate Steepest Coordinate Descent »
Sebastian Stich · Anant Raj · Martin Jaggi -
2017 Poster: Oracle Complexity of Second-Order Methods for Finite-Sum Problems »
Yossi Arjevani · Ohad Shamir -
2017 Poster: Online Learning with Local Permutations and Delayed Feedback »
Liran Szlak · Ohad Shamir -
2017 Poster: Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis »
Dan Garber · Ohad Shamir · Nati Srebro -
2017 Poster: Depth-Width Tradeoffs in Approximating Natural Functions With Neural Networks »
Itay Safran · Ohad Shamir -
2017 Poster: Failures of Gradient-Based Deep Learning »
Shaked Shammah · Shai Shalev-Shwartz · Ohad Shamir -
2017 Talk: Depth-Width Tradeoffs in Approximating Natural Functions With Neural Networks »
Itay Safran · Ohad Shamir -
2017 Talk: Failures of Gradient-Based Deep Learning »
Shaked Shammah · Shai Shalev-Shwartz · Ohad Shamir -
2017 Talk: Oracle Complexity of Second-Order Methods for Finite-Sum Problems »
Yossi Arjevani · Ohad Shamir -
2017 Talk: Online Learning with Local Permutations and Delayed Feedback »
Liran Szlak · Ohad Shamir -
2017 Talk: Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis »
Dan Garber · Ohad Shamir · Nati Srebro