

Nonconvex Variance Reduced Optimization with Arbitrary Sampling

Samuel Horvath · Peter Richtarik

Pacific Ballroom #95

Keywords: [ Randomized Linear Algebra ] [ Non-convex Optimization ]


We provide the first importance sampling variants of variance reduced algorithms for empirical risk minimization with non-convex loss functions. In particular, we analyze non-convex versions of SVRG, SAGA and SARAH. Our methods have the capacity to speed up the training process by an order of magnitude compared to the state of the art on real datasets. Moreover, we improve upon the current mini-batch analysis of these methods by proposing importance sampling for minibatches in this setting. Surprisingly, our approach can in some regimes lead to superlinear speedup with respect to the minibatch size, which is unusual in stochastic optimization. All of the above results follow from a general analysis of the methods that works with arbitrary sampling, i.e., a fully general randomized strategy for selecting the subset of examples to be sampled in each iteration. Finally, we also perform a novel importance sampling analysis of SARAH in the convex setting.
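To make the idea of importance sampling in a variance reduced method concrete, here is a minimal sketch of an SVRG-style loop with non-uniform sampling of single examples. It is an illustration under stated assumptions, not the authors' implementation: the loss (a squared error on a sigmoid output, which is non-convex), the choice of probabilities proportional to the squared row norms of the data matrix, and all function names (`importance_probs`, `svrg_estimator`, `svrg_importance`) are hypothetical. The one essential ingredient is the reweighting by `1 / (n * p[i])`, which keeps the gradient estimator unbiased for any sampling distribution with positive probabilities.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_i(x, A, b, i):
    # Gradient of the assumed non-convex loss f_i(x) = (sigmoid(a_i^T x) - b_i)^2.
    s = sigmoid(A[i] @ x)
    return 2.0 * (s - b[i]) * s * (1.0 - s) * A[i]

def full_grad(x, A, b):
    # Gradient of f(x) = (1/n) * sum_i f_i(x).
    s = sigmoid(A @ x)
    return (2.0 * (s - b) * s * (1.0 - s)) @ A / A.shape[0]

def importance_probs(A):
    # Assumption: sample example i with probability proportional to ||a_i||^2,
    # used here as a proxy for its smoothness constant.
    w = np.sum(A * A, axis=1)
    return w / w.sum()

def svrg_estimator(x, w, mu, A, b, i, p):
    # Reweighting by 1 / (n * p_i) keeps the estimator unbiased:
    # E_i[estimator] = full_grad(x) for any p with p_i > 0.
    n = A.shape[0]
    return (grad_i(x, A, b, i) - grad_i(w, A, b, i)) / (n * p[i]) + mu

def svrg_importance(A, b, x0, step=0.1, epochs=5, inner=50, seed=0):
    rng = np.random.default_rng(seed)
    p = importance_probs(A)
    x = x0.copy()
    for _ in range(epochs):
        w = x.copy()               # snapshot point
        mu = full_grad(w, A, b)    # full gradient at the snapshot
        for _ in range(inner):
            i = rng.choice(A.shape[0], p=p)  # non-uniform sampling
            x = x - step * svrg_estimator(x, w, mu, A, b, i, p)
    return x
```

Setting `p` to the uniform distribution recovers the usual SVRG estimator; the step size, epoch count, and inner-loop length above are placeholder values rather than tuned or theoretically prescribed ones.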
