Oral
Faster Derivative-Free Stochastic Algorithm for Shared Memory Machines
Bin Gu · Zhouyuan Huo · Cheng Deng · Heng Huang

Wed Jul 11th 11:50 AM -- 12:00 PM @ A9

Asynchronous parallel stochastic gradient optimization has been playing a pivotal role to solve large-scale machine learning problems in big data applications. Zeroth-order (derivative-free) methods estimate the gradient only by two function evaluations, thus have been applied to solve the problems where the explicit gradient calculations are computationally expensive or infeasible. Recently, the first asynchronous parallel stochastic zeroth-order algorithm (AsySZO) was proposed. However, its convergence rate is O(1/SQRT{T}) for the smooth, possibly non-convex learning problems, which is significantly slower than O(1/T) the best convergence rate of (asynchronous) stochastic gradient algorithm. To fill this gap, in this paper, we first point out the fundamental reason leading to the slow convergence rate of AsySZO, and then propose a new asynchronous stochastic zerothorder algorithm (AsySZO+). We provide a faster convergence rate O(1/bT) (b is the mini-batch size) for AsySZO+ by the rigorous theoretical analysis, which is a significant improvement over O(1/SQRT{T}). The experimental results on the application of ensemble learning confirm that our AsySZO+ has a faster convergence rate than the existing (asynchronous) stochastic zeroth-order algorithms.

Author Information

Bin Gu (University of Pittsburgh)
Zhouyuan Huo (University of Pittsburgh)
Cheng Deng (Xidian University)
Heng Huang (University of Pittsburgh)

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors