Deep networks are highly nonlinear and difficult to optimize. During training, the parameter iterate may move from one local basin to another, and the data distribution itself may even change. Inspired by the close connection between stochastic optimization and online learning, we propose a variant of the {\em follow the regularized leader} (FTRL) algorithm, called {\em follow the moving leader} (FTML). Unlike algorithms in the FTRL family, FTML weights recent samples more heavily in each iteration, and so it can adapt more quickly to changes. We show that FTML enjoys the nice properties of RMSprop and Adam while avoiding their pitfalls. Experimental results on a number of deep learning models and tasks demonstrate that FTML converges quickly and outperforms other state-of-the-art optimizers.
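To make the idea concrete, here is a minimal NumPy sketch of an FTML-style update, following the update equations (the running quantities v, d, and z with bias correction) as presented in the paper. The hyperparameter defaults (eta, beta1, beta2, eps) and the toy usage loop are illustrative assumptions, not the paper's prescribed settings.

```python
import numpy as np

def ftml_step(theta, grad, state, t, eta=0.001, beta1=0.6, beta2=0.999, eps=1e-8):
    """One FTML-style update (sketch; hyperparameter values are illustrative).

    theta : current parameter vector
    grad  : stochastic gradient evaluated at theta
    state : dict carrying the running quantities v, d, z between steps
    t     : 1-based iteration counter (used for bias correction)
    """
    # Exponential moving average of squared gradients, as in RMSprop/Adam.
    v = beta2 * state["v"] + (1.0 - beta2) * grad ** 2
    # Per-coordinate regularization strength, with bias correction in t.
    d = (1.0 - beta1 ** t) / eta * (np.sqrt(v / (1.0 - beta2 ** t)) + eps)
    sigma = d - beta1 * state["d"]
    # Decaying weights on past samples: recent gradients count more,
    # which is what lets FTML track a "moving" leader.
    z = beta1 * state["z"] + (1.0 - beta1) * grad - sigma * theta
    state.update(v=v, d=d, z=z)
    # Closed-form minimizer of the weighted, regularized surrogate.
    return -z / d

# Usage sketch on a toy quadratic f(theta) = 0.5 * ||theta||^2,
# whose gradient at theta is simply theta.
theta = np.ones(3)
state = {"v": np.zeros(3), "d": np.zeros(3), "z": np.zeros(3)}
for t in range(1, 101):
    theta = ftml_step(theta, theta, state, t)
```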
Author Information
Shuai Zheng (Hong Kong University of Science and Technology)
James Kwok (Hong Kong University of Science and Technology)
Related Events (a corresponding poster, oral, or spotlight)
- 2017 Talk: Follow the Moving Leader in Deep Learning
  Mon. Aug 7th 04:24 -- 04:42 AM, Room C4.8
More from the Same Authors
- 2023 Poster: Effective Structured Prompting by Meta-Learning and Representative Verbalizer
  Weisen Jiang · Yu Zhang · James Kwok
- 2023 Poster: Non-autoregressive Conditional Diffusion Models for Time Series Prediction
  Lifeng Shen · James Kwok
- 2023 Poster: Nonparametric Iterative Machine Teaching
  Chen Zhang · Xiaofeng Cao · Weiyang Liu · Ivor Tsang · James Kwok
- 2022 Poster: Subspace Learning for Effective Meta-Learning
  Weisen Jiang · James Kwok · Yu Zhang
- 2022 Spotlight: Subspace Learning for Effective Meta-Learning
  Weisen Jiang · James Kwok · Yu Zhang
- 2022 Poster: Efficient Variance Reduction for Meta-learning
  Hansi Yang · James Kwok
- 2022 Spotlight: Efficient Variance Reduction for Meta-learning
  Hansi Yang · James Kwok
- 2021 Poster: SparseBERT: Rethinking the Importance Analysis in Self-attention
  Han Shi · Jiahui Gao · Xiaozhe Ren · Hang Xu · Xiaodan Liang · Zhenguo Li · James Kwok
- 2021 Spotlight: SparseBERT: Rethinking the Importance Analysis in Self-attention
  Han Shi · Jiahui Gao · Xiaozhe Ren · Hang Xu · Xiaodan Liang · Zhenguo Li · James Kwok
- 2020 Poster: Searching to Exploit Memorization Effect in Learning with Noisy Labels
  Quanming Yao · Hansi Yang · Bo Han · Gang Niu · James Kwok
- 2019 Poster: Efficient Nonconvex Regularized Tensor Completion with Structure-aware Proximal Iterations
  Quanming Yao · James Kwok · Bo Han
- 2019 Oral: Efficient Nonconvex Regularized Tensor Completion with Structure-aware Proximal Iterations
  Quanming Yao · James Kwok · Bo Han
- 2018 Poster: Online Convolutional Sparse Coding with Sample-Dependent Dictionary
  Yaqing Wang · Quanming Yao · James Kwok · Lionel Ni
- 2018 Poster: Lightweight Stochastic Optimization for Minimizing Finite Sums with Infinite Data
  Shuai Zheng · James Kwok
- 2018 Oral: Lightweight Stochastic Optimization for Minimizing Finite Sums with Infinite Data
  Shuai Zheng · James Kwok
- 2018 Oral: Online Convolutional Sparse Coding with Sample-Dependent Dictionary
  Yaqing Wang · Quanming Yao · James Kwok · Lionel Ni