Poster in Workshop: Duality Principles for Modern Machine Learning
Time-Reversed Dissipation Induces Duality Between Minimizing Gradient Norm and Function Value
Jaeyeon Kim · Asuman Ozdaglar · Chanwoo Park · Ernest Ryu
Keywords: [ Gradient norm minimization ] [ Function value minimization ] [ H-Duality ]
In convex optimization, first-order optimization methods that minimize function values have been a central subject of study since Nesterov's seminal work of 1983. Recently, however, Kim and Fessler's OGM-G and Lee et al.'s FISTA-G have been presented as alternatives that instead minimize the gradient magnitude. We present H-duality, a surprising one-to-one correspondence between methods that efficiently minimize function values and methods that efficiently minimize gradient magnitude. In continuous-time formulations, H-duality corresponds to reversing the time dependence of the dissipation/friction term. To the best of our knowledge, H-duality is distinct from Lagrange/Fenchel duality and from any previously known duality or symmetry relation. Using H-duality, we obtain a clearer understanding of the symmetry between Nesterov's method and OGM-G, derive a new class of methods that efficiently reduce gradient magnitudes of smooth convex functions, and find a new composite minimization method that is simpler and faster than FISTA-G.
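As a rough sketch of the continuous-time picture (using the standard second-order ODE model of accelerated methods with a friction coefficient \(\gamma(t)\); constants are omitted, and this is an illustration of the correspondence rather than the paper's exact statement): the dynamics

\[
\ddot{X}(t) + \gamma(t)\,\dot{X}(t) + \nabla f(X(t)) = 0, \qquad t \in [0, T],
\]

have as their H-dual the same dynamics with the time dependence of the friction reversed,

\[
\ddot{X}(t) + \gamma(T - t)\,\dot{X}(t) + \nabla f(X(t)) = 0.
\]

For example, with Nesterov's choice \(\gamma(t) = 3/t\), the first ODE drives \(f(X(T)) - f(x_\star)\) to zero at rate \(O(1/T^2)\); its H-dual, whose friction \(3/(T - t)\) grows as \(t \to T\), drives \(\|\nabla f(X(T))\|^2\) to zero at a matching \(O(1/T^2)\) rate.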