Value Function based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems

Lucy Gao · Jane J. Ye · Haian Yin · Shangzhi Zeng · Jin Zhang

Room 318 - 320
Tue 19 Jul 11:35 a.m. — 11:40 a.m. PDT

Existing gradient-based optimization methods for hyperparameter tuning can guarantee theoretical convergence to stationary solutions only when, for fixed upper-level variables, the lower-level problem of the bilevel program is strongly convex (LLSC) and smooth (LLS). This condition is not satisfied for bilevel programs arising from tuning hyperparameters in many machine learning algorithms. In this work, we develop a sequentially convergent Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA). We show that this algorithm achieves stationary solutions without the LLSC and LLS assumptions for bilevel programs from a broad class of hyperparameter tuning applications. Extensive experiments support our theoretical results and show that the proposed VF-iDCA yields superior performance when applied to tune hyperparameters.
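
The abstract does not state the problem formally, so the following is only a sketch of the kind of reformulation the algorithm's name suggests. All symbols are assumptions introduced for illustration and are not taken from this page: x denotes the hyperparameters (upper-level variables), y the model parameters (lower-level variables), F the upper-level (validation) objective, f the lower-level (training) objective, g the lower-level constraints, X and Y the feasible sets, and v the lower-level value function.

```latex
% Illustrative notation only; the symbols are assumptions, not taken from the abstract.
\begin{align*}
\text{(bilevel program)}\quad
  & \min_{x \in X,\; y \in Y} F(x, y)
    \quad \text{s.t.} \quad
    y \in \operatorname*{arg\,min}_{y' \in Y}
      \bigl\{ f(x, y') : g(x, y') \le 0 \bigr\}, \\[4pt]
\text{(value function)}\quad
  & v(x) := \min_{y' \in Y}
      \bigl\{ f(x, y') : g(x, y') \le 0 \bigr\}, \\[4pt]
\text{(single-level form)}\quad
  & \min_{x \in X,\; y \in Y} F(x, y)
    \quad \text{s.t.} \quad
    f(x, y) - v(x) \le 0, \quad g(x, y) \le 0.
\end{align*}
```

Under the additional assumption that f and g are jointly convex in (x, y) and Y is convex, v is convex, so the constraint f(x, y) - v(x) <= 0 is a difference of two convex functions. That structure is what a difference-of-convex method can exploit without requiring strong convexity or smoothness of the lower level.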
