Poster
in
Workshop: Differentiable Almost Everything: Differentiable Relaxations, Algorithms, Operators, and Simulators
How Consensus-Based Optimization can be Interpreted as a Stochastic Relaxation of Gradient Descent
Konstantin Riedl · Timo Klock · Carina Geldhauser · Massimo Fornasier
Keywords: [ Stochastic Gradient Descent ] [ nonconvex optimization ] [ Interacting Multi-Particle Optimization Methods ] [ Stochastic Methods ] [ Gradient Relaxation ] [ Stochastic Relaxation ] [ multi-agent systems ] [ Optimization ] [ Machine Learning ]
We provide a novel analytical perspective on the theoretical understanding of gradient-based learning algorithms by interpreting consensus-based optimization (CBO), a recently proposed multi-particle derivative-free optimization method, as a stochastic relaxation of gradient descent. Remarkably, we observe that, through the communication of the particles, CBO exhibits stochastic gradient descent (SGD)-like behavior despite relying solely on evaluations of the objective function. The fundamental value of such a link between CBO and SGD lies in the fact that CBO is provably globally convergent to global minimizers for ample classes of nonsmooth and nonconvex objective functions. Hence, on the one hand, we offer a novel explanation for the success of stochastic relaxations of gradient descent by furnishing useful and precise insights into how problem-tailored stochastic perturbations of gradient descent (such as those induced by CBO) overcome energy barriers and reach deep levels of nonconvex functions. On the other hand, and contrary to the conventional wisdom that derivative-free methods are inefficient or lack generalization abilities, our results unveil an intrinsic gradient descent nature of such heuristics.
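To make the derivative-free dynamics concrete, the following is a minimal sketch of the standard CBO iteration on a simple nonconvex toy objective: each particle drifts toward a Gibbs-weighted consensus point (a softmin of the objective over the ensemble) and is perturbed by noise scaled by its distance to that consensus. The specific objective, parameter values (`lam`, `sigma`, `alpha`, step size `dt`), and function names here are illustrative choices, not the authors' implementation; note that no gradient of `f` is ever evaluated.

```python
import math
import random

def f(x):
    # Hypothetical nonconvex test objective: global minimizer at x = 0,
    # with spurious local minima created by the oscillatory term.
    return x**2 + 8.0 * math.sin(x)**2

def cbo_minimize(f, n_particles=50, steps=400, dt=0.05,
                 lam=1.0, sigma=0.7, alpha=50.0, seed=0):
    """One-dimensional consensus-based optimization (illustrative sketch)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-5.0, 5.0) for _ in range(n_particles)]
    for _ in range(steps):
        # Consensus point: Gibbs-weighted average of the particles,
        # concentrating on particles with small objective value as alpha grows.
        fvals = [f(x) for x in xs]
        fmin = min(fvals)  # subtract the minimum to stabilize the exponentials
        ws = [math.exp(-alpha * (fv - fmin)) for fv in fvals]
        wsum = sum(ws)
        x_cons = sum(w * x for w, x in zip(ws, xs)) / wsum
        # Drift toward the consensus point plus a stochastic perturbation
        # whose magnitude vanishes as a particle reaches consensus.
        xs = [x - lam * dt * (x - x_cons)
              + sigma * abs(x - x_cons) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
              for x in xs]
    return x_cons

x_star = cbo_minimize(f)
```

Despite never touching a derivative, the drift term `-lam * (x - x_cons)` acts like a (stochastically relaxed) gradient step toward the current best region, which is the SGD-like behavior the abstract refers to.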