We consider the minimization of non-convex functions that typically arise in machine learning. Specifically, we focus our attention on a variant of trust region methods known as cubic regularization. This approach is particularly attractive because it escapes strict saddle points and provides stronger convergence guarantees than first- and second-order methods as well as classical trust region methods. However, its high computational cost makes it impractical for large-scale learning. Here, we propose a novel method that uses sub-sampling to lower this computational cost. Using concentration inequalities, we provide a sampling scheme whose gradient and Hessian approximations are accurate enough to retain the strong global and local convergence guarantees of cubically regularized methods. To the best of our knowledge, this is the first work to give global convergence guarantees for a sub-sampled variant of cubic regularization on non-convex functions. Furthermore, we provide experimental results supporting our theory.
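
To make the idea concrete, below is a minimal Python/NumPy sketch of one sub-sampled cubic-regularization step. It only illustrates the general scheme described in the abstract: the sample sizes, the per-example gradient/Hessian callables (grad_i, hess_i), and the plain gradient-descent inner solver for the cubic subproblem are illustrative assumptions, not the paper's exact algorithm or sampling conditions.

    import numpy as np

    def subsampled_cubic_step(x, grad_i, hess_i, n, sigma, g_size, h_size, rng,
                              inner_iters=200, inner_lr=1e-2):
        # Draw independent index sets for the gradient and Hessian estimates.
        gi = rng.choice(n, size=g_size, replace=False)
        hi = rng.choice(n, size=h_size, replace=False)
        g = np.mean([grad_i(x, i) for i in gi], axis=0)   # sub-sampled gradient
        H = np.mean([hess_i(x, i) for i in hi], axis=0)   # sub-sampled Hessian

        # Approximately minimize the cubic model
        #   m(s) = g^T s + 0.5 s^T H s + (sigma/3) ||s||^3
        # with plain gradient descent (a simple stand-in for Lanczos-type subproblem solvers).
        s = np.zeros_like(x)
        for _ in range(inner_iters):
            s -= inner_lr * (g + H @ s + sigma * np.linalg.norm(s) * s)
        return x + s

    # Toy usage on a least-squares problem f(x) = (1/n) sum_i 0.5 * (a_i^T x - b_i)^2.
    rng = np.random.default_rng(0)
    A, b = rng.normal(size=(1000, 10)), rng.normal(size=1000)
    grad_i = lambda x, i: A[i] * (A[i] @ x - b[i])
    hess_i = lambda x, i: np.outer(A[i], A[i])
    x = np.zeros(10)
    for _ in range(20):
        x = subsampled_cubic_step(x, grad_i, hess_i, n=1000, sigma=1.0,
                                  g_size=100, h_size=50, rng=rng)

The point of the sketch is that only mini-batch estimates of the gradient and Hessian enter the cubic model, so the per-step cost scales with the sample sizes rather than the full dataset; the paper's contribution is choosing those sample sizes, via concentration inequalities, so that the convergence guarantees of exact cubic regularization are preserved.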
Author Information
Jonas Kohler (ETH Zurich)
Aurelien Lucchi (ETH)
Related Events (a corresponding poster, oral, or spotlight)
- 2017 Talk: Sub-sampled Cubic Regularization for Non-convex Optimization
  Tue. Aug 8th 05:30 -- 05:48 AM, Room Parkside 2
More from the Same Authors
- 2021 Spotlight: Neural Symbolic Regression that scales
  Luca Biggio · Tommaso Bendinelli · Alexander Neitz · Aurelien Lucchi · Giambattista Parascandolo
- 2021 Poster: Neural Symbolic Regression that scales
  Luca Biggio · Tommaso Bendinelli · Alexander Neitz · Aurelien Lucchi · Giambattista Parascandolo
- 2020 Poster: Randomized Block-Diagonal Preconditioning for Parallel Learning
  Celestine Mendler-Dünner · Aurelien Lucchi
- 2020 Poster: An Accelerated DFO Algorithm for Finite-sum Convex Functions
  Yuwen Chen · Antonio Orvieto · Aurelien Lucchi
- 2018 Poster: A Distributed Second-Order Algorithm You Can Trust
  Celestine Mendler-Dünner · Aurelien Lucchi · Matilde Gargiani · Yatao Bian · Thomas Hofmann · Martin Jaggi
- 2018 Oral: A Distributed Second-Order Algorithm You Can Trust
  Celestine Mendler-Dünner · Aurelien Lucchi · Matilde Gargiani · Yatao Bian · Thomas Hofmann · Martin Jaggi
- 2018 Poster: Escaping Saddles with Stochastic Gradients
  Hadi Daneshmand · Jonas Kohler · Aurelien Lucchi · Thomas Hofmann
- 2018 Oral: Escaping Saddles with Stochastic Gradients
  Hadi Daneshmand · Jonas Kohler · Aurelien Lucchi · Thomas Hofmann