Federated learning is a key scenario in modern large-scale machine learning in which the data remains distributed over a large number of clients and the task is to learn a centralized model without transmitting the client data. The standard optimization algorithm used in this setting is Federated Averaging (FedAvg), chosen for its low communication cost. We obtain a tight characterization of the convergence of FedAvg and prove that heterogeneity (non-iid-ness) in the clients' data causes a 'drift' in the local updates, which leads to poor performance.
As a solution, we propose a new algorithm (SCAFFOLD) which uses control variates (variance reduction) to correct for the 'client drift'. We prove that SCAFFOLD requires significantly fewer communication rounds and is not affected by data heterogeneity or client sampling. Further, we show that (for quadratics) SCAFFOLD can take advantage of similarity in the clients' data, yielding even faster convergence. The latter is the first result to quantify the usefulness of local steps in distributed optimization.
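The drift and its correction can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes two hypothetical clients with scalar quadratic losses f_i(x) = 0.5 * a_i * (x - b_i)^2, exact (noise-free) gradients, full client participation, and the control-variate update in which each client sets its new control variate to the average gradient seen along its local path. The function names, step sizes, and client parameters are illustrative choices, not values from the abstract.

```python
# Toy comparison of FedAvg and a SCAFFOLD-style update on heterogeneous
# scalar quadratics. Everything here (names, losses, step sizes) is an
# illustrative assumption, not taken from the paper's experiments.

def local_grad(a, b, x):
    """Gradient of the client loss 0.5 * a * (x - b)**2."""
    return a * (x - b)


def fedavg(clients, rounds, K, lr):
    """Plain FedAvg: K local gradient steps per client, then average."""
    x = 0.0
    for _ in range(rounds):
        deltas = []
        for a, b in clients:
            y = x
            for _ in range(K):
                y -= lr * local_grad(a, b, y)
            deltas.append(y - x)
        x += sum(deltas) / len(deltas)
    return x


def scaffold(clients, rounds, K, lr, global_lr=1.0):
    """SCAFFOLD-style update: local steps corrected by control variates."""
    n = len(clients)
    x = 0.0            # server model
    c = 0.0            # server control variate
    c_i = [0.0] * n    # per-client control variates
    for _ in range(rounds):
        dy, dc = [], []
        for i, (a, b) in enumerate(clients):
            y = x
            for _ in range(K):
                # Subtract the client's own direction estimate and add the
                # server's, so the local step follows the global direction.
                y -= lr * (local_grad(a, b, y) - c_i[i] + c)
            # New client control variate: average gradient along the path.
            c_new = c_i[i] - c + (x - y) / (K * lr)
            dy.append(y - x)
            dc.append(c_new - c_i[i])
            c_i[i] = c_new
        x += global_lr * sum(dy) / n
        c += sum(dc) / n
    return x


# Two clients with different curvature and different minimizers.
# The minimizer of the average loss is (1*0 + 3*4) / (1 + 3) = 3.0.
CLIENTS = [(1.0, 0.0), (3.0, 4.0)]

if __name__ == "__main__":
    print("FedAvg  :", fedavg(CLIENTS, rounds=100, K=10, lr=0.1))
    print("SCAFFOLD:", scaffold(CLIENTS, rounds=100, K=10, lr=0.1))
```

In this sketch FedAvg settles at a point biased away from the global minimizer 3.0 because each client's local steps pull toward its own optimum, while the corrected update converges to 3.0. The behaviour on one toy instance is only an illustration and does not stand in for the convergence rates proved in the paper.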
Author Information
Sai Praneeth Reddy Karimireddy (EPFL)
Satyen Kale (Google)
Mehryar Mohri (Google Research and Courant Institute of Mathematical Sciences)
Sashank Jakkam Reddi (Google)
Sebastian Stich (EPFL)
Ananda Theertha Suresh (Google Research)
More from the Same Authors
-
2021 : Remember What You Want to Forget: Algorithms for Machine Unlearning »
Ayush Sekhari · Jayadev Acharya · Gautam Kamath · Ananda Theertha Suresh -
2021 : On the Renyi Differential Privacy of the Shuffle Model »
Antonious Girgis · Deepesh Data · Suhas Diggavi · Ananda Theertha Suresh · Peter Kairouz -
2021 : Learning with User-Level Privacy »
Daniel A Levy · Ziteng Sun · Kareem Amin · Satyen Kale · Alex Kulesza · Mehryar Mohri · Ananda Theertha Suresh -
2023 : Federated Heavy Hitter Recovery under Linear Sketching »
Adria Gascon · Peter Kairouz · Ziteng Sun · Ananda Suresh -
2023 : SpecTr: Fast Speculative Decoding via Optimal Transport »
Ziteng Sun · Ananda Suresh · Jae Ro · Ahmad Beirami · Himanshu Jain · Felix Xinnan Yu · Michael Riley · Sanjiv Kumar -
2023 : Ranking with Abstention »
Anqi Mao · Mehryar Mohri · Yutao Zhong -
2023 Poster: Subset-Based Instance Optimality in Private Estimation »
Travis Dick · Alex Kulesza · Ziteng Sun · Ananda Suresh -
2023 Poster: $H$-Consistency Bounds for Pairwise Misranking Loss Surrogates »
Anqi Mao · Mehryar Mohri · Yutao Zhong -
2023 Poster: Federated Heavy Hitter Recovery under Linear Sketching »
Adria Gascon · Peter Kairouz · Ziteng Sun · Ananda Suresh -
2023 Poster: Reinforcement Learning Can Be More Efficient with Multiple Rewards »
Christoph Dann · Yishay Mansour · Mehryar Mohri -
2023 Poster: Cross-Entropy Loss Functions: Theoretical Analysis and Applications »
Anqi Mao · Mehryar Mohri · Yutao Zhong -
2023 Poster: Algorithms for bounding contribution for histogram estimation under user-level privacy »
Yuhan Liu · Ananda Suresh · Wennan Zhu · Peter Kairouz · Marco Gruteser -
2023 Poster: Efficient Training of Language Models using Few-Shot Learning »
Sashank Jakkam Reddi · Sobhan Miryoosefi · Stefani Karp · Shankar Krishnan · Satyen Kale · Seungyeon Kim · Sanjiv Kumar -
2022 Poster: In defense of dual-encoders for neural ranking »
Aditya Menon · Sadeep Jayasumana · Ankit Singh Rawat · Seungyeon Kim · Sashank Jakkam Reddi · Sanjiv Kumar -
2022 Spotlight: In defense of dual-encoders for neural ranking »
Aditya Menon · Sadeep Jayasumana · Ankit Singh Rawat · Seungyeon Kim · Sashank Jakkam Reddi · Sanjiv Kumar -
2022 Poster: Agnostic Learnability of Halfspaces via Logistic Loss »
Ziwei Ji · Kwangjun Ahn · Pranjal Awasthi · Satyen Kale · Stefani Karp -
2022 Poster: The Fundamental Price of Secure Aggregation in Differentially Private Federated Learning »
Wei-Ning Chen · Christopher Choquette Choo · Peter Kairouz · Ananda Suresh -
2022 Poster: Private Adaptive Optimization with Side information »
Tian Li · Manzil Zaheer · Sashank Jakkam Reddi · Virginia Smith -
2022 Poster: Robust Training of Neural Networks Using Scale Invariant Architectures »
Zhiyuan Li · Srinadh Bhojanapalli · Manzil Zaheer · Sashank Jakkam Reddi · Sanjiv Kumar -
2022 Poster: Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation »
Chris Dann · Yishay Mansour · Mehryar Mohri · Ayush Sekhari · Karthik Sridharan -
2022 Spotlight: Private Adaptive Optimization with Side information »
Tian Li · Manzil Zaheer · Sashank Jakkam Reddi · Virginia Smith -
2022 Spotlight: The Fundamental Price of Secure Aggregation in Differentially Private Federated Learning »
Wei-Ning Chen · Christopher Choquette Choo · Peter Kairouz · Ananda Suresh -
2022 Oral: Agnostic Learnability of Halfspaces via Logistic Loss »
Ziwei Ji · Kwangjun Ahn · Pranjal Awasthi · Satyen Kale · Stefani Karp -
2022 Spotlight: Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation »
Chris Dann · Yishay Mansour · Mehryar Mohri · Ayush Sekhari · Karthik Sridharan -
2022 Oral: Robust Training of Neural Networks Using Scale Invariant Architectures »
Zhiyuan Li · Srinadh Bhojanapalli · Manzil Zaheer · Sashank Jakkam Reddi · Sanjiv Kumar -
2022 Poster: H-Consistency Bounds for Surrogate Loss Minimizers »
Pranjal Awasthi · Anqi Mao · Mehryar Mohri · Yutao Zhong -
2022 Poster: Correlated Quantization for Distributed Mean Estimation and Optimization »
Ananda Suresh · Ziteng Sun · Jae Ro · Felix Xinnan Yu -
2022 Oral: H-Consistency Bounds for Surrogate Loss Minimizers »
Pranjal Awasthi · Anqi Mao · Mehryar Mohri · Yutao Zhong -
2022 Spotlight: Correlated Quantization for Distributed Mean Estimation and Optimization »
Ananda Suresh · Ziteng Sun · Jae Ro · Felix Xinnan Yu -
2021 : Algorithms for Efficient Federated and Decentralized Learning (Q&A) »
Sebastian Stich -
2021 : Algorithms for Efficient Federated and Decentralized Learning »
Sebastian Stich -
2021 Spotlight: A Discriminative Technique for Multiple-Source Adaptation »
Corinna Cortes · Mehryar Mohri · Ananda Theertha Suresh · Ningshan Zhang -
2021 Poster: A Discriminative Technique for Multiple-Source Adaptation »
Corinna Cortes · Mehryar Mohri · Ananda Theertha Suresh · Ningshan Zhang -
2021 Poster: Consensus Control for Decentralized Deep Learning »
Lingjing Kong · Tao Lin · Anastasiia Koloskova · Martin Jaggi · Sebastian Stich -
2021 Poster: Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data »
Tao Lin · Sai Praneeth Reddy Karimireddy · Sebastian Stich · Martin Jaggi -
2021 Spotlight: Relative Deviation Margin Bounds »
Corinna Cortes · Mehryar Mohri · Ananda Theertha Suresh -
2021 Spotlight: Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data »
Tao Lin · Sai Praneeth Reddy Karimireddy · Sebastian Stich · Martin Jaggi -
2021 Spotlight: Consensus Control for Decentralized Deep Learning »
Lingjing Kong · Tao Lin · Anastasiia Koloskova · Martin Jaggi · Sebastian Stich -
2021 Poster: A statistical perspective on distillation »
Aditya Menon · Ankit Singh Rawat · Sashank Jakkam Reddi · Seungyeon Kim · Sanjiv Kumar -
2021 Poster: Learning from History for Byzantine Robust Optimization »
Sai Praneeth Reddy Karimireddy · Lie He · Martin Jaggi -
2021 Poster: Relative Deviation Margin Bounds »
Corinna Cortes · Mehryar Mohri · Ananda Theertha Suresh -
2021 Poster: Disentangling Sampling and Labeling Bias for Learning in Large-output Spaces »
Ankit Singh Rawat · Aditya Menon · Wittawat Jitkrittum · Sadeep Jayasumana · Felix Xinnan Yu · Sashank Jakkam Reddi · Sanjiv Kumar -
2021 Spotlight: A statistical perspective on distillation »
Aditya Menon · Ankit Singh Rawat · Sashank Jakkam Reddi · Seungyeon Kim · Sanjiv Kumar -
2021 Spotlight: Learning from History for Byzantine Robust Optimization »
Sai Praneeth Reddy Karimireddy · Lie He · Martin Jaggi -
2021 Spotlight: Disentangling Sampling and Labeling Bias for Learning in Large-output Spaces »
Ankit Singh Rawat · Aditya Menon · Wittawat Jitkrittum · Sadeep Jayasumana · Felix Xinnan Yu · Sashank Jakkam Reddi · Sanjiv Kumar -
2021 Poster: Federated Composite Optimization »
Honglin Yuan · Manzil Zaheer · Sashank Jakkam Reddi -
2021 Spotlight: Federated Composite Optimization »
Honglin Yuan · Manzil Zaheer · Sashank Jakkam Reddi -
2020 Poster: Extrapolation for Large-batch Training in Deep Learning »
Tao Lin · Lingjing Kong · Sebastian Stich · Martin Jaggi -
2020 Poster: Adaptive Region-Based Active Learning »
Corinna Cortes · Giulia DeSalvo · Claudio Gentile · Mehryar Mohri · Ningshan Zhang -
2020 Poster: Low-Rank Bottleneck in Multi-head Attention Models »
Srinadh Bhojanapalli · Chulhee Yun · Ankit Singh Rawat · Sashank Jakkam Reddi · Sanjiv Kumar -
2020 Poster: Online Learning with Dependent Stochastic Feedback Graphs »
Corinna Cortes · Giulia DeSalvo · Claudio Gentile · Mehryar Mohri · Ningshan Zhang -
2020 Poster: A Unified Theory of Decentralized SGD with Changing Topology and Local Updates »
Anastasiia Koloskova · Nicolas Loizou · Sadra Boreiri · Martin Jaggi · Sebastian Stich -
2020 Poster: Is Local SGD Better than Minibatch SGD? »
Blake Woodworth · Kumar Kshitij Patel · Sebastian Stich · Zhen Dai · Brian Bullins · Brendan McMahan · Ohad Shamir · Nati Srebro -
2020 Poster: Adversarial Learning Guarantees for Linear Hypotheses and Neural Networks »
Pranjal Awasthi · Natalie Frank · Mehryar Mohri -
2020 Poster: FedBoost: A Communication-Efficient Algorithm for Federated Learning »
Jenny Hamer · Mehryar Mohri · Ananda Theertha Suresh -
2019 : Poster Session 1 (all papers) »
Matilde Gargiani · Yochai Zur · Chaim Baskin · Evgenii Zheltonozhskii · Liam Li · Ameet Talwalkar · Xuedong Shang · Harkirat Singh Behl · Atilim Gunes Baydin · Ivo Couckuyt · Tom Dhaene · Chieh Lin · Wei Wei · Min Sun · Orchid Majumder · Michele Donini · Yoshihiko Ozaki · Ryan P. Adams · Christian Geißler · Ping Luo · zhanglin peng · Ruimao Zhang · John Langford · Rich Caruana · Debadeepta Dey · Charles Weill · Xavi Gonzalvo · Scott Yang · Scott Yak · Eugen Hotaj · Vladimir Macko · Mehryar Mohri · Corinna Cortes · Stefan Webb · Jonathan Chen · Martin Jankowiak · Noah Goodman · Aaron Klein · Frank Hutter · Mojan Javaheripi · Mohammad Samragh · Sungbin Lim · Taesup Kim · SUNGWOONG KIM · Michael Volpp · Iddo Drori · Yamuna Krishnamurthy · Kyunghyun Cho · Stanislaw Jastrzebski · Quentin de Laroussilhe · Mingxing Tan · Xiao Ma · Neil Houlsby · Andrea Gesmundo · Zalán Borsos · Krzysztof Maziarz · Felipe Petroski Such · Joel Lehman · Kenneth Stanley · Jeff Clune · Pieter Gijsbers · Joaquin Vanschoren · Felix Mohr · Eyke Hüllermeier · Zheng Xiong · Wenpeng Zhang · Wenwu Zhu · Weijia Shao · Aleksandra Faust · Michal Valko · Michael Y Li · Hugo Jair Escalante · Marcel Wever · Andrey Khorlin · Tara Javidi · Anthony Francis · Saurajit Mukherjee · Jungtaek Kim · Michael McCourt · Saehoon Kim · Tackgeun You · Seungjin Choi · Nicolas Knudde · Alexander Tornede · Ghassen Jerfel -
2019 Poster: Escaping Saddle Points with Adaptive Gradient Methods »
Matthew Staib · Sashank Jakkam Reddi · Satyen Kale · Sanjiv Kumar · Suvrit Sra -
2019 Poster: Agnostic Federated Learning »
Mehryar Mohri · Gary Sivek · Ananda Suresh -
2019 Oral: Agnostic Federated Learning »
Mehryar Mohri · Gary Sivek · Ananda Suresh -
2019 Oral: Escaping Saddle Points with Adaptive Gradient Methods »
Matthew Staib · Sashank Jakkam Reddi · Satyen Kale · Sanjiv Kumar · Suvrit Sra -
2019 Poster: Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication »
Anastasiia Koloskova · Sebastian Stich · Martin Jaggi -
2019 Poster: Error Feedback Fixes SignSGD and other Gradient Compression Schemes »
Sai Praneeth Reddy Karimireddy · Quentin Rebjock · Sebastian Stich · Martin Jaggi -
2019 Oral: Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication »
Anastasiia Koloskova · Sebastian Stich · Martin Jaggi -
2019 Oral: Error Feedback Fixes SignSGD and other Gradient Compression Schemes »
Sai Praneeth Reddy Karimireddy · Quentin Rebjock · Sebastian Stich · Martin Jaggi -
2018 Poster: On Matching Pursuit and Coordinate Descent »
Francesco Locatello · Anant Raj · Sai Praneeth Reddy Karimireddy · Gunnar Ratsch · Bernhard Schölkopf · Sebastian Stich · Martin Jaggi -
2018 Oral: On Matching Pursuit and Coordinate Descent »
Francesco Locatello · Anant Raj · Sai Praneeth Reddy Karimireddy · Gunnar Ratsch · Bernhard Schölkopf · Sebastian Stich · Martin Jaggi -
2017 Poster: Distributed Mean Estimation with Limited Communication »
Ananda Theertha Suresh · Felix Xinnan Yu · Sanjiv Kumar · Brendan McMahan -
2017 Poster: Approximate Steepest Coordinate Descent »
Sebastian Stich · Anant Raj · Martin Jaggi -
2017 Poster: A Unified Maximum Likelihood Approach for Estimating Symmetric Properties of Discrete Distributions »
Jayadev Acharya · Hirakendu Das · Alon Orlitsky · Ananda Suresh -
2017 Poster: Maximum Selection and Ranking under Noisy Comparisons »
Moein Falahatgar · Alon Orlitsky · Venkatadheeraj Pichapati · Ananda Theertha Suresh -
2017 Talk: A Unified Maximum Likelihood Approach for Estimating Symmetric Properties of Discrete Distributions »
Jayadev Acharya · Hirakendu Das · Alon Orlitsky · Ananda Suresh -
2017 Talk: Distributed Mean Estimation with Limited Communication »
Ananda Theertha Suresh · Felix Xinnan Yu · Sanjiv Kumar · Brendan McMahan -
2017 Talk: Maximum Selection and Ranking under Noisy Comparisons »
Moein Falahatgar · Alon Orlitsky · Venkatadheeraj Pichapati · Ananda Theertha Suresh -
2017 Talk: Approximate Steepest Coordinate Descent »
Sebastian Stich · Anant Raj · Martin Jaggi