ICML Poster History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms

Poster

History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms

Kaiyi Ji · Zhe Wang · Bowen Weng · Yi Zhou · Wei Zhang · Yingbin LIANG

Keywords: [ Non-convex Optimization ] [ Optimization - Non-convex ]

[ Abstract ]

Abstract:

Variance-reduced algorithms, although achieve great theoretical performance, can run slowly in practice due to the periodic gradient estimation with a large batch of data. Batch-size adaptation thus arises as a promising approach to accelerate such algorithms. However, existing schemes either apply prescribed batch-size adaption rule or exploit the information along optimization path via additional backtracking and condition verification steps. In this paper, we propose a novel scheme, which eliminates backtracking line search but still exploits the information along optimization path by adapting the batch size via history stochastic gradients. We further theoretically show that such a scheme substantially reduces the overall complexity for popular variance-reduced algorithms SVRG and SARAH/SPIDER for both conventional nonconvex optimization and reinforcement learning problems. To this end, we develop a new convergence analysis framework to handle the dependence of the batch size on history stochastic gradients. Extensive experiments validate the effectiveness of the proposed batch-size adaptation scheme.

Chat is not available.