

Plenary Talk in Workshop: Beyond first-order methods in machine learning systems

Conjugate gradient techniques for nonconvex optimization

Clément Royer


Abstract:

Nonconvex optimization formulations have recently gained popularity due to their prevalence in deep learning applications, but they also arise in numerous problems from data analysis and robust statistics. In this setting, second-order stationary points often correspond to global optima, which motivates the development of second-order methods with fast convergence capabilities. Meanwhile, the study of acceleration phenomena, initially proposed in convex optimization, has expanded to nonconvex formulations, and has led to a revisiting of popular large-scale nonlinear optimization frameworks to equip them with complexity guarantees while maintaining their practical appeal.

In this talk, we will review recent advances in employing Krylov methods to accelerate optimization procedures in a nonconvex setting. We will particularly focus on the linear conjugate gradient method and its variations, and we will show that this method is naturally built to detect nonconvexity on quadratic problems. We will then move to a general nonconvex nonlinear optimization setting, and describe a Newton-Conjugate Gradient method with the best known complexity guarantees. Numerical illustrations of the performance of this method and its extensions will be provided, with a focus on nonconvex instances in data science.
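To make the nonconvexity-detection idea concrete, the sketch below shows a minimal linear conjugate gradient loop on a quadratic q(x) = ½ xᵀAx − bᵀx. This is an illustrative assumption of how the mechanism works rather than the speaker's actual algorithm: the curvature term pᵀAp computed at every CG iteration is nonpositive exactly when the current search direction certifies that A is not positive definite, so the loop can return that direction as evidence of nonconvexity. The function name and return convention are hypothetical.

```python
import numpy as np

def cg_with_curvature_check(A, b, tol=1e-8, max_iter=None):
    """Linear CG on q(x) = 0.5 x^T A x - b^T x (i.e., solve Ax = b).

    If a direction p with p^T A p <= 0 appears, A is not positive
    definite; return p as a certificate of nonconvexity.
    (Illustrative sketch, not the method described in the talk.)
    """
    n = b.shape[0]
    max_iter = max_iter if max_iter is not None else n
    x = np.zeros(n)
    r = b.copy()          # residual b - A x at x = 0
    p = r.copy()          # first search direction
    for _ in range(max_iter):
        Ap = A @ p
        curvature = p @ Ap
        if curvature <= 0:
            # Nonpositive curvature detected along p: quadratic is nonconvex.
            return x, p, "negative_curvature"
        alpha = (r @ r) / curvature
        x = x + alpha * p
        r_new = r - alpha * Ap
        if np.linalg.norm(r_new) <= tol:
            return x, None, "converged"
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
    return x, None, "max_iter"

# Example: an indefinite quadratic triggers the negative-curvature exit.
A = np.array([[2.0, 0.0], [0.0, -1.0]])
b = np.array([1.0, 1.0])
print(cg_with_curvature_check(A, b)[2])  # -> "negative_curvature"
```

In Newton-CG type methods, a direction returned this way is typically exploited as a descent or negative-curvature step, which is one ingredient behind the complexity guarantees mentioned above.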