Spotlight
Value Function based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems
Lucy Gao · Jane J. Ye · Haian Yin · Shangzhi Zeng · Jin Zhang
Room 318 - 320
Existing gradient-based optimization methods for hyperparameter tuning can only guarantee theoretical convergence to stationary solutions when the bilevel program satisfies the condition that, for fixed upper-level variables, the lower-level problem is strongly convex (LLSC) and smooth (LLS). This condition is not satisfied for bilevel programs arising from tuning hyperparameters in many machine learning algorithms. In this work, we develop a sequentially convergent Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA). We then ask: can this algorithm achieve stationary solutions without the LLSC and LLS assumptions? We provide a positive answer to this question for bilevel programs from a broad class of hyperparameter tuning applications. Extensive experiments confirm our theoretical results and demonstrate the superiority of the proposed VF-iDCA when applied to hyperparameter tuning.
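For orientation, a minimal sketch of the bilevel structure referred to above and of a value-function reformulation of the kind the algorithm's name suggests; the symbols (lambda for hyperparameters, w for model parameters, F for the validation objective, f for the training objective, v for the lower-level value function) are illustrative notation, not necessarily the paper's exact formulation:

% Generic bilevel hyperparameter selection problem (illustrative notation)
\[
  \min_{\lambda,\, w}\; F(\lambda, w)
  \quad \text{s.t.}\quad
  w \in \operatorname*{arg\,min}_{w'} f(\lambda, w').
\]
% Defining the lower-level value function v(lambda), the lower-level
% optimality condition can be rewritten as a single inequality constraint:
\[
  v(\lambda) := \min_{w'} f(\lambda, w'),
  \qquad
  \min_{\lambda,\, w}\; F(\lambda, w)
  \quad \text{s.t.}\quad
  f(\lambda, w) - v(\lambda) \le 0.
\]

When f is jointly convex in (lambda, w), both f and the value function v are convex, so the reformulated constraint is a difference of two convex functions; this is the kind of structure a difference-of-convex algorithm can work with even when the lower level is neither strongly convex nor smooth.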