An Adaptive Method for Minimizing Non-negative Losses
Antonio Orvieto · Lin Xiao
This paper introduces Non-negative Gauss-Newton (NGN), an adaptive optimization method that exploits non-negativity, a common feature of loss functions in machine learning. Using a Gauss-Newton-inspired approximation for non-negative losses, NGN offers an adaptive stepsize that can automatically warm up and decay while tracking complex loss landscapes. We provide both convergence rates and empirical evaluations, and the results compare favorably with the classical (stochastic) gradient method in both theory and practice.
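The abstract does not spell out the update rule, but a Gauss-Newton approximation of a non-negative loss f ≥ 0 (writing f = ½r² with r = √(2f)) naturally yields a stepsize of the form γ / (1 + γ‖∇f‖² / (2f)), which shrinks when the gradient is large relative to the loss and approaches the base stepsize γ near a minimum. The sketch below illustrates this kind of adaptive step on a toy quadratic; the exact stepsize formula is our assumption, not necessarily the paper's.

```python
import numpy as np

def ngn_like_step(x, f, grad, gamma=0.5, eps=1e-12):
    """One NGN-style step on a non-negative loss.

    Assumed stepsize form (a sketch, not the paper's verified formula):
        gamma / (1 + gamma * ||g||^2 / (2 * f(x)))
    which is small when the gradient dominates the loss (early, steep
    regions: automatic warm-up) and approaches gamma as f(x) -> 0.
    """
    g = grad(x)
    fx = f(x)
    step = gamma / (1.0 + gamma * np.dot(g, g) / (2.0 * fx + eps))
    return x - step * g

# Toy usage: f(x) = 0.5 * ||x||^2 is non-negative with grad(x) = x.
f = lambda x: 0.5 * np.dot(x, x)
grad = lambda x: x
x = np.array([2.0, -1.0])
for _ in range(100):
    x = ngn_like_step(x, f, grad, gamma=0.5)
# On this quadratic the loss contracts geometrically toward zero.
```

On this example ‖g‖² = 2f exactly, so the stepsize is constant (γ / (1 + γ)); on non-quadratic losses the ratio varies along the trajectory, producing the warm-up/decay behavior the abstract describes.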
Author Information
Antonio Orvieto (ETH Zurich)
Lin Xiao (Meta)
More from the Same Authors
- 2022 : Should You Follow the Gradient Flow? Insights from Runge-Kutta Gradient Descent »
  Xiang Li · Antonio Orvieto
- 2023 : On the Universality of Linear Recurrences Followed by Nonlinear Projections »
  Antonio Orvieto · Soham De · Razvan Pascanu · Caglar Gulcehre · Samuel Smith
- 2023 : On the Advantage of Lion Compared to signSGD with Momentum »
  Alessandro Noiato · Luca Biggio · Antonio Orvieto
- 2022 Poster: Anticorrelated Noise Injection for Improved Generalization »
  Antonio Orvieto · Hans Kersting · Frank Proske · Francis Bach · Aurelien Lucchi
- 2022 Spotlight: Anticorrelated Noise Injection for Improved Generalization »
  Antonio Orvieto · Hans Kersting · Frank Proske · Francis Bach · Aurelien Lucchi
- 2020 Poster: An Accelerated DFO Algorithm for Finite-sum Convex Functions »
  Yuwen Chen · Antonio Orvieto · Aurelien Lucchi