We propose a novel class of stochastic, adaptive methods for minimizing self-concordant functions that can be expressed as an expected value. At each step, these methods estimate the true objective function by the empirical mean over a freshly drawn sample, which makes the problem tractable. The step sizes are chosen adaptively, so the user does not need to supply one. Methods in this class include extensions of gradient descent (GD) and BFGS. We show that, given a suitable amount of sampling, the stochastic adaptive GD method attains linear convergence in expectation, and with further sampling, the stochastic adaptive BFGS method attains R-superlinear convergence. We present experiments showing that these methods compare favorably to SGD.
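The idea of combining a per-step empirical mean with a step size computed from local curvature can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: the toy objective F(x) = E[a]^T x - sum_j log(x_j), the helper names `sample_a` and `adaptive_gd`, the batch size, and the step size rule t = rho / ((rho + delta) * delta), which is one rule known to keep iterates inside the domain of a standard self-concordant function. It may differ from the rule the authors actually analyze.

```python
import numpy as np


def sample_a(rng, batch_size, dim=3):
    # Hypothetical data source: a ~ Uniform(1, 3) per coordinate, so E[a] = 2
    # and the minimizer of the toy objective is x* = 1 / E[a] = 0.5.
    return rng.uniform(1.0, 3.0, size=(batch_size, dim))


def adaptive_gd(sample_fn, x0, batch_size=256, iters=200, seed=0):
    """Sketch of stochastic GD with a curvature-based step size.

    Toy objective (an assumption): F(x) = E[a]^T x - sum_j log(x_j), x > 0.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        # Empirical mean over a fresh sample drawn at this step.
        a_bar = sample_fn(rng, batch_size).mean(axis=0)
        g = a_bar - 1.0 / x          # gradient of the sampled objective
        hess_diag = 1.0 / x**2       # its (diagonal) Hessian
        d = -g                       # gradient descent direction
        rho = g @ g                  # rho = -g^T d
        delta = np.sqrt(d @ (hess_diag * d))  # local norm of d
        if delta == 0.0:
            break
        # Assumed step size rule; it gives t * delta < 1, so x stays positive.
        t = rho / ((rho + delta) * delta)
        x = x + t * d
    return x


x_hat = adaptive_gd(sample_a, np.ones(3))
print(x_hat)
```

No step size is passed in: `t` is recomputed at every iteration from the sampled gradient and Hessian. Because the gradient is estimated from a finite batch, the final iterate only approximates the true minimizer, and a larger `batch_size` reduces that error.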
Chaoxu Zhou (Columbia University)
Wenbo Gao (Columbia University)
Donald Goldfarb (Columbia University)
Related Events (a corresponding poster, oral, or spotlight)
2017 Talk: Stochastic Adaptive Quasi-Newton Methods for Minimizing Expected Values
Mon Aug 7th 07:51 -- 08:09 AM Room Parkside 2