A Hybrid Variance-Reduced Method for Decentralized Stochastic Non-Convex Optimization

Ran Xin · Usman Khan · Soummya Kar

Keywords: [ Distributed and Parallel Optimization ]

Tue 20 Jul 9 a.m. PDT — 11 a.m. PDT
Spotlight presentation: Optimization (Distributed)
Tue 20 Jul 5 a.m. PDT — 6 a.m. PDT

Abstract: This paper considers decentralized stochastic optimization over a network of $n$ nodes, where each node possesses a smooth non-convex local cost function and the goal of the networked nodes is to find an $\epsilon$-accurate first-order stationary point of the sum of the local costs. We focus on an online setting, where each node accesses its local cost only by means of a stochastic first-order oracle that returns a noisy version of the exact gradient. In this context, we propose a novel single-loop decentralized hybrid variance-reduced stochastic gradient method, called GT-HSGD, that outperforms the existing approaches in terms of both the oracle complexity and practical implementation. The GT-HSGD algorithm implements specialized local hybrid stochastic gradient estimators that are fused over the network to track the global gradient. Remarkably, GT-HSGD achieves a network topology-independent oracle complexity of $O(n^{-1}\epsilon^{-3})$ when the required error tolerance $\epsilon$ is small enough, leading to a linear speedup with respect to the centralized optimal online variance-reduced approaches that operate on a single node. Numerical experiments are provided to illustrate our main technical results.
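To make the structure described in the abstract concrete, the following is a minimal sketch, not the authors' reference implementation. It combines a hybrid variance-reduced estimator at each node with gradient tracking over the network, in a single loop. Everything specific here is an assumption made for illustration: the toy scalar local costs $0.5(x - b_i)^2$ (convex, chosen only so the result is easy to check), the noise model, the mixing matrix `weights`, the step size `alpha`, the mixing parameter `beta`, and all function names.

```python
import random


def stochastic_grad(x, b, sample):
    """Noisy gradient of the toy local cost 0.5 * (x - b)**2 (illustrative assumption)."""
    scale_noise, shift_noise = sample
    return (1.0 + scale_noise) * (x - b) + shift_noise


def mix(weights, values):
    """One round of weighted averaging with neighbors, i.e. the product W @ values."""
    return [sum(w_ij * v_j for w_ij, v_j in zip(row, values)) for row in weights]


def gt_hsgd_sketch(b, weights, alpha=0.05, beta=0.1, steps=2000, seed=0):
    """Single-loop sketch: hybrid local estimators fused by gradient tracking.

    b       -- per-node targets defining the toy local costs (hypothetical data)
    weights -- doubly stochastic mixing matrix of the network (assumed given)
    """
    rng = random.Random(seed)
    n = len(b)

    def draw():
        # One oracle sample: multiplicative and additive noise (assumed model).
        return (rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1))

    x_prev = [0.0] * n
    # Initialize each local estimator with a plain stochastic gradient.
    v_prev = [stochastic_grad(x_prev[i], b[i], draw()) for i in range(n)]
    y = mix(weights, v_prev)  # tracker of the network-average gradient
    x = mix(weights, [x_prev[i] - alpha * y[i] for i in range(n)])

    for _ in range(steps):
        v = []
        for i in range(n):
            sample = draw()  # the SAME sample is used at both iterates
            g_new = stochastic_grad(x[i], b[i], sample)
            g_old = stochastic_grad(x_prev[i], b[i], sample)
            # Hybrid estimator: fresh gradient plus a damped recursive correction.
            v.append(g_new + (1.0 - beta) * (v_prev[i] - g_old))
        # Gradient tracking: fuse the change in local estimators over the network.
        y = mix(weights, [y[i] + v[i] - v_prev[i] for i in range(n)])
        x_prev, v_prev = x, v
        # Local descent step followed by averaging with neighbors.
        x = mix(weights, [x[i] - alpha * y[i] for i in range(n)])
    return x


if __name__ == "__main__":
    # Hypothetical 4-node ring; the sum of the toy costs is stationary at mean(b).
    ring = [
        [0.50, 0.25, 0.00, 0.25],
        [0.25, 0.50, 0.25, 0.00],
        [0.00, 0.25, 0.50, 0.25],
        [0.25, 0.00, 0.25, 0.50],
    ]
    print(gt_hsgd_sketch([1.0, 2.0, 3.0, 6.0], ring))
```

In this sketch, setting `beta = 1` reduces each local estimator to a plain stochastic gradient, while smaller values lean more on the recursive correction. The parameter values above were picked by hand for the toy problem and carry no guarantee for other costs or networks.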
