

Spotlight

Adaptive Random Walk Gradient Descent for Decentralized Optimization

Tao Sun · Dongsheng Li · Bao Wang

Room 327 - 329

Abstract:

In this paper, we study adaptive step size random walk gradient descent with momentum for decentralized optimization, in which the training samples are not drawn independently of one another. We establish theoretical convergence rates for this method in both the convex and nonconvex settings. In particular, we prove that adaptive random walk algorithms perform as well as their non-adaptive counterparts on dependent data in general, and achieve acceleration when the stochastic gradients are "sparse". Moreover, we study the zeroth-order version of adaptive random walk gradient descent and provide the corresponding convergence results. All assumptions used in this paper are mild and general, making our results applicable to a broad range of machine learning problems.
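
The following is a minimal, illustrative sketch (not the authors' code) of the general idea described in the abstract: a single iterate performs a random walk over a communication graph, and at each visited node it takes a momentum step with an AdaGrad-style coordinate-wise adaptive step size using that node's local gradient. The least-squares problem, ring graph, and hyperparameters (`beta`, `eta`, `eps`) are hypothetical placeholders chosen only to make the example runnable; consecutive samples are dependent because the walk can only move to neighboring nodes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Decentralized setup: each of n_nodes holds one local least-squares term
# f_i(x) = 0.5 * ||A_i x - b_i||^2; the global objective is their average.
# All problem data below are synthetic placeholders for illustration.
n_nodes, dim = 8, 5
A = rng.normal(size=(n_nodes, 3, dim))
x_star = rng.normal(size=dim)
b = A @ x_star + 0.1 * rng.normal(size=(n_nodes, 3))

# Ring communication graph: node i can pass the iterate only to its two neighbors,
# so gradients sampled along the walk are dependent rather than i.i.d.
neighbors = {i: [(i - 1) % n_nodes, (i + 1) % n_nodes] for i in range(n_nodes)}

def local_grad(i, x):
    """Gradient of node i's local objective at the current iterate x."""
    return A[i].T @ (A[i] @ x - b[i])

# Random walk gradient descent with momentum and an adaptive step size.
x = np.zeros(dim)            # model iterate carried along the walk
m = np.zeros(dim)            # momentum buffer
v = np.zeros(dim)            # accumulated squared gradients (AdaGrad-style)
beta, eta, eps = 0.9, 0.5, 1e-8   # hypothetical hyperparameters
node = 0                     # the walk starts at an arbitrary node

for t in range(2000):
    g = local_grad(node, x)              # gradient from the currently visited node only
    m = beta * m + (1 - beta) * g        # momentum update
    v += g * g                           # coordinate-wise adaptive scaling
    x -= eta * m / (np.sqrt(v) + eps)    # adaptive step
    node = rng.choice(neighbors[node])   # iterate hops to a uniformly random neighbor

print("distance to x*:", np.linalg.norm(x - x_star))
```

The adaptive scaling by the accumulated squared gradients is what makes larger steps possible along rarely updated coordinates, which is where the abstract's claimed acceleration for "sparse" stochastic gradients would show up; a zeroth-order variant would replace `local_grad` with a finite-difference gradient estimate.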
