Choosing the right parameters for optimization algorithms is often the key to their success in practice. Solving this problem with a learning-to-learn approach, which runs meta-gradient descent on a meta-objective based on the trajectory that the optimizer generates, was recently shown to be effective. However, the meta-optimization problem is difficult: the meta-gradient can often explode or vanish, and the learned optimizer may not generalize well if the meta-objective is not chosen carefully. In this paper we give meta-optimization guarantees for the learning-to-learn approach on a simple problem: tuning the step size for a quadratic loss. Our results show that the naïve meta-objective suffers from the meta-gradient explosion/vanishing problem. Although the meta-objective can be designed so that the meta-gradient remains polynomially bounded, computing the meta-gradient directly using backpropagation still leads to numerical issues. We also characterize when it is necessary to compute the meta-objective on a separate validation set to ensure the generalization performance of the learned optimizer. Finally, we verify our results empirically and show that similar phenomena appear even for more complicated learned optimizers parametrized by neural networks.
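The explosion/vanishing behavior described in the abstract can be seen in a minimal sketch (my own illustration, not the paper's code or analysis). For gradient descent with step size eta on the one-dimensional quadratic f(x) = 0.5 * lam * x^2, the iterates satisfy x_{t+1} = (1 - eta*lam) * x_t, so the naive meta-objective f(x_T) has a closed-form derivative in eta whose magnitude scales like |1 - eta*lam|^(2T-1):

```python
# Illustration (assumed setup, not from the paper): meta-gradient of the
# naive meta-objective f(x_T) with respect to the step size eta, for
# gradient descent on the quadratic f(x) = 0.5 * lam * x**2.
def meta_gradient(eta, lam, x0, T):
    """Return d f(x_T) / d eta, where x_{t+1} = (1 - eta*lam) * x_t.

    Closed form: x_T = (1 - eta*lam)**T * x0, so
      f(x_T)          = 0.5 * lam * (1 - eta*lam)**(2*T) * x0**2
      d f(x_T)/d eta  = -T * lam**2 * x0**2 * (1 - eta*lam)**(2*T - 1)
    """
    return -T * lam**2 * x0**2 * (1.0 - eta * lam) ** (2 * T - 1)

# With a divergent step size (|1 - eta*lam| > 1) the meta-gradient
# grows exponentially in the unrolling horizon T (explosion) ...
for T in (5, 10, 20):
    print("explode T =", T, "grad =", meta_gradient(eta=2.5, lam=1.0, x0=1.0, T=T))

# ... while with a convergent step size (|1 - eta*lam| < 1) it shrinks
# exponentially in T (vanishing).
for T in (5, 10, 20):
    print("vanish  T =", T, "grad =", meta_gradient(eta=0.5, lam=1.0, x0=1.0, T=T))
```

Either way, a meta-optimizer following this gradient receives signal whose scale depends exponentially on the horizon T, which is the pathology the paper's naive-objective result formalizes.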
Author Information
Xiang Wang (Duke University)
Shuai Yuan (Duke University)
Chenwei Wu (Duke University)
Rong Ge (Duke University)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Poster: Guarantees for Tuning the Step Size using a Learning-to-Learn Approach
  Wed. Jul 21st, 04:00 -- 06:00 PM, Room: Virtual
More from the Same Authors
- 2023: The Role of Linguistic Priors in Measuring Compositional Generalization of Vision-language Models
  Chenwei Wu · Li Li · Stefano Ermon · Patrick Haffner · Rong Ge · Zaiwei Zhang
- 2023 Poster: Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression
  Mo Zhou · Rong Ge
- 2023 Poster: Hiding Data Helps: On the Benefits of Masking for Sparse Coding
  Muthu Chidambaram · Chenwei Wu · Yu Cheng · Rong Ge
- 2023 Poster: Provably Learning Diverse Features in Multi-View Data with Midpoint Mixup
  Muthu Chidambaram · Xiang Wang · Chenwei Wu · Rong Ge
- 2022 Poster: Online Algorithms with Multiple Predictions
  Keerti Anand · Rong Ge · Amit Kumar · Debmalya Panigrahi
- 2022 Spotlight: Online Algorithms with Multiple Predictions
  Keerti Anand · Rong Ge · Amit Kumar · Debmalya Panigrahi
- 2022 Poster: Extracting Latent State Representations with Linear Dynamics from Rich Observations
  Abraham Frandsen · Rong Ge · Holden Lee
- 2022 Spotlight: Extracting Latent State Representations with Linear Dynamics from Rich Observations
  Abraham Frandsen · Rong Ge · Holden Lee
- 2020 Poster: High-dimensional Robust Mean Estimation via Gradient Descent
  Yu Cheng · Ilias Diakonikolas · Rong Ge · Mahdi Soltanolkotabi
- 2020 Poster: Customizing ML Predictions for Online Algorithms
  Keerti Anand · Rong Ge · Debmalya Panigrahi
- 2018 Poster: Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator
  Maryam Fazel · Rong Ge · Sham Kakade · Mehran Mesbahi
- 2018 Oral: Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator
  Maryam Fazel · Rong Ge · Sham Kakade · Mehran Mesbahi
- 2018 Poster: Stronger Generalization Bounds for Deep Nets via a Compression Approach
  Sanjeev Arora · Rong Ge · Behnam Neyshabur · Yi Zhang
- 2018 Oral: Stronger Generalization Bounds for Deep Nets via a Compression Approach
  Sanjeev Arora · Rong Ge · Behnam Neyshabur · Yi Zhang
- 2017 Poster: How to Escape Saddle Points Efficiently
  Chi Jin · Rong Ge · Praneeth Netrapalli · Sham Kakade · Michael Jordan
- 2017 Talk: How to Escape Saddle Points Efficiently
  Chi Jin · Rong Ge · Praneeth Netrapalli · Sham Kakade · Michael Jordan
- 2017 Poster: No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis
  Rong Ge · Chi Jin · Yi Zheng
- 2017 Poster: Generalization and Equilibrium in Generative Adversarial Nets (GANs)
  Sanjeev Arora · Rong Ge · Yingyu Liang · Tengyu Ma · Yi Zhang
- 2017 Talk: No Spurious Local Minima in Nonconvex Low Rank Problems: A Unified Geometric Analysis
  Rong Ge · Chi Jin · Yi Zheng
- 2017 Talk: Generalization and Equilibrium in Generative Adversarial Nets (GANs)
  Sanjeev Arora · Rong Ge · Yingyu Liang · Tengyu Ma · Yi Zhang