Poster
Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: Joint Gradient Estimation and Tracking
Haoran Sun · Songtao Lu · Mingyi Hong
Thu Jul 16 09:00 AM -- 09:45 AM & Thu Jul 16 08:00 PM -- 08:45 PM (PDT)
Many modern large-scale machine learning problems benefit from decentralized and stochastic optimization. Recent works have shown that utilizing both decentralized computing and local stochastic gradient estimates can outperform state-of-the-art centralized algorithms in applications involving highly non-convex problems, such as training deep neural networks.
In this work, we propose a decentralized stochastic algorithm for certain smooth non-convex problems in which the system has $m$ nodes and each node holds a large number of samples (denoted by $n$). Unlike the majority of existing decentralized learning algorithms for either stochastic or finite-sum problems, our focus is on {\it both} reducing the total number of communication rounds among the nodes {\it and} accessing the minimum number of local data samples. In particular, we propose an algorithm named D-GET (decentralized gradient estimation and tracking), which jointly performs decentralized gradient estimation (estimating the local gradient using a subset of local samples) {\it and} gradient tracking (tracking the global full gradient using local estimates). We show that to achieve an $\epsilon$-stationary solution of the deterministic finite-sum problem, the proposed algorithm requires an $\mathcal{O}(mn^{1/2}\epsilon^{-1})$ sample complexity and an $\mathcal{O}(\epsilon^{-1})$ communication complexity. These bounds significantly improve upon the best existing bounds of $\mathcal{O}(mn\epsilon^{-1})$ and $\mathcal{O}(\epsilon^{-1})$, respectively. Similarly, for online problems, the proposed method achieves an $\mathcal{O}(m \epsilon^{-3/2})$ sample complexity and an $\mathcal{O}(\epsilon^{-1})$ communication complexity.
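To make the two ingredients concrete, the sketch below shows a D-GET-style iteration on a toy decentralized least-squares problem: each node mixes its iterate and its tracking variable with its neighbors through a doubly stochastic matrix, refreshes its local gradient estimate with a full local pass every q iterations, and otherwise applies a recursive minibatch correction, while the tracking variable accumulates the changes in those estimates. This is a minimal illustration under assumed choices (quadratic local losses, a ring mixing matrix W, and the step size, batch size, and refresh period shown), not the authors' reference implementation or analyzed parameter settings.

```python
# Minimal sketch of a D-GET-style update (joint gradient estimation + gradient tracking)
# on a toy decentralized least-squares problem. Topology, step size, batch size, and
# refresh period are illustrative assumptions, not the values analyzed in the paper.
import numpy as np

rng = np.random.default_rng(0)
m, n, d = 5, 200, 10                      # nodes, samples per node, dimension

# Local data: node i holds (A[i], b[i]) with f_i(x) = 1/(2n) ||A[i] x - b[i]||^2.
A = rng.normal(size=(m, n, d))
b = rng.normal(size=(m, n))

def local_grad(i, x, idx=None):
    """Gradient of f_i at x, over all local samples or over a sampled subset idx."""
    Ai, bi = (A[i], b[i]) if idx is None else (A[i][idx], b[i][idx])
    return Ai.T @ (Ai @ x - bi) / len(bi)

# Doubly stochastic mixing matrix for a ring graph (each node averages with two neighbors).
W = np.zeros((m, m))
for i in range(m):
    W[i, i] = 0.5
    W[i, (i - 1) % m] = 0.25
    W[i, (i + 1) % m] = 0.25

alpha, batch, q, T = 0.05, 20, 10, 200    # step size, minibatch size, refresh period, iterations

x = np.zeros((m, d))                                      # local iterates
v = np.stack([local_grad(i, x[i]) for i in range(m)])     # local gradient estimates
y = v.copy()                                              # gradient-tracking variables

for t in range(1, T + 1):
    # Consensus step plus descent along the tracked global gradient direction.
    x_new = W @ x - alpha * y

    # Gradient estimation: full local gradient every q steps, otherwise a
    # recursive (SARAH/SPIDER-style) minibatch correction of the previous estimate.
    v_new = np.empty_like(v)
    for i in range(m):
        if t % q == 0:
            v_new[i] = local_grad(i, x_new[i])
        else:
            idx = rng.choice(n, size=batch, replace=False)
            v_new[i] = v[i] + local_grad(i, x_new[i], idx) - local_grad(i, x[i], idx)

    # Gradient tracking: mix neighbors' trackers and add the change in local estimates.
    y = W @ y + v_new - v
    x, v = x_new, v_new

# Rough stationarity check at the averaged iterate.
avg_grad = np.mean([local_grad(i, x.mean(axis=0)) for i in range(m)], axis=0)
print("||average gradient|| =", np.linalg.norm(avg_grad))
```

In this sketch the minibatch corrections account for the sample complexity savings (only a subset of local samples is touched between full refreshes), while the mixing steps with W are the only communication, which is why communication rounds and sample accesses can be controlled separately.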
Author Information
Haoran Sun (University of Minnesota)
Songtao Lu (IBM Research)
Mingyi Hong (University of Minnesota)
More from the Same Authors
- 2021 : Understanding Clipped FedAvg: Convergence and Client-Level Differential Privacy » xinwei zhang · Xiangyi Chen · Steven Wu · Mingyi Hong
- 2022 : Maximum-Likelihood Inverse Reinforcement Learning with Finite-Time Guarantees » Siliang Zeng · Chenliang Li · Alfredo Garcia · Mingyi Hong
- 2023 Poster: Linearly Constrained Bilevel Optimization: A Smoothed Implicit Gradient Approach » Prashant Khanduri · Ioannis Tsaknakis · Yihua Zhang · Jia Liu · Sijia Liu · Jiawei Zhang · Mingyi Hong
- 2023 Poster: FedAvg Converges to Zero Training Loss Linearly for Overparameterized Multi-Layer Neural Networks » Bingqing Song · Prashant Khanduri · xinwei zhang · Jinfeng Yi · Mingyi Hong
- 2023 Poster: Understanding Backdoor Attacks through the Adaptability Hypothesis » Xun Xian · Ganghua Wang · Jayanth Srinivasa · Ashish Kundu · Xuan Bi · Mingyi Hong · Jie Ding
- 2022 Poster: A Stochastic Multi-Rate Control Framework For Modeling Distributed Optimization Algorithms » xinwei zhang · Mingyi Hong · Sairaj Dhople · Nicola Elia
- 2022 Spotlight: A Stochastic Multi-Rate Control Framework For Modeling Distributed Optimization Algorithms » xinwei zhang · Mingyi Hong · Sairaj Dhople · Nicola Elia
- 2022 Poster: Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy » xinwei zhang · Xiangyi Chen · Mingyi Hong · Steven Wu · Jinfeng Yi
- 2022 Poster: Revisiting and Advancing Fast Adversarial Training Through The Lens of Bi-Level Optimization » Yihua Zhang · Guanhua Zhang · Prashant Khanduri · Mingyi Hong · Shiyu Chang · Sijia Liu
- 2022 Spotlight: Revisiting and Advancing Fast Adversarial Training Through The Lens of Bi-Level Optimization » Yihua Zhang · Guanhua Zhang · Prashant Khanduri · Mingyi Hong · Shiyu Chang · Sijia Liu
- 2022 Spotlight: Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy » xinwei zhang · Xiangyi Chen · Mingyi Hong · Steven Wu · Jinfeng Yi
- 2021 Spotlight: Decentralized Riemannian Gradient Descent on the Stiefel Manifold » Shixiang Chen · Alfredo Garcia · Mingyi Hong · Shahin Shahrampour
- 2021 Poster: Decentralized Riemannian Gradient Descent on the Stiefel Manifold » Shixiang Chen · Alfredo Garcia · Mingyi Hong · Shahin Shahrampour
- 2020 Poster: Min-Max Optimization without Gradients: Convergence and Applications to Black-Box Evasion and Poisoning Attacks » Sijia Liu · Songtao Lu · Xiangyi Chen · Yao Feng · Kaidi Xu · Abdullah Al-Dujaili · Mingyi Hong · Una-May O'Reilly
- 2019 Poster: PA-GD: On the Convergence of Perturbed Alternating Gradient Descent to Second-Order Stationary Points for Structured Nonconvex Optimization » Songtao Lu · Mingyi Hong · Zhengdao Wang
- 2019 Oral: PA-GD: On the Convergence of Perturbed Alternating Gradient Descent to Second-Order Stationary Points for Structured Nonconvex Optimization » Songtao Lu · Mingyi Hong · Zhengdao Wang
- 2018 Poster: Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solution for Nonconvex Distributed Optimization Over Networks » Mingyi Hong · Meisam Razaviyayn · Jason Lee
- 2018 Oral: Gradient Primal-Dual Algorithm Converges to Second-Order Stationary Solution for Nonconvex Distributed Optimization Over Networks » Mingyi Hong · Meisam Razaviyayn · Jason Lee