

Poster
Tue Aug 08 01:30 AM -- 05:00 AM (PDT) @ Gallery #6
Learning to Learn without Gradient Descent by Gradient Descent
Yutian Chen · Matthew Hoffman · Sergio Gómez Colmenarejo · Misha Denil · Timothy Lillicrap · Matthew Botvinick · Nando de Freitas

We learn recurrent neural network optimizers, trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer: they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks, and hyper-parameter tuning tasks. Up to the training horizon, the learned optimizers learn to trade off exploration and exploitation, and compare favourably with heavily engineered Bayesian optimization packages for hyper-parameter tuning.
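The sketch below illustrates the core idea in the abstract: a recurrent network proposes the next query point from the previous query and its observed value, and the network's own weights are trained by gradient descent through an unrolled optimization episode on synthetic objectives. This is a minimal, hedged illustration, not the paper's implementation; the random quadratic objectives, dimensions, horizon, and network sizes are assumptions chosen for brevity (the paper trains on functions sampled from Gaussian process priors).

```python
import torch
import torch.nn as nn

DIM, HIDDEN, HORIZON = 2, 32, 20  # illustrative sizes, not the paper's settings


class RNNOptimizer(nn.Module):
    """Maps the previous query and its observed value to the next query."""

    def __init__(self):
        super().__init__()
        self.cell = nn.LSTMCell(DIM + 1, HIDDEN)
        self.head = nn.Linear(HIDDEN, DIM)

    def rollout(self, f, batch):
        # Unroll one black-box optimization episode of length HORIZON.
        h = torch.zeros(batch, HIDDEN)
        c = torch.zeros(batch, HIDDEN)
        x = torch.zeros(batch, DIM)  # initial query point
        y = f(x)
        total = y.sum()              # cumulative observed function values
        for _ in range(HORIZON):
            inp = torch.cat([x, y.unsqueeze(-1)], dim=-1)
            h, c = self.cell(inp, (h, c))
            x = self.head(h)         # propose the next query point
            y = f(x)
            total = total + y.sum()
        return total / (batch * (HORIZON + 1))


def random_quadratic(batch):
    """Sample a batch of synthetic objectives f(x) = ||x - target||^2 (an assumed
    stand-in for the paper's Gaussian process training functions)."""
    target = torch.randn(batch, DIM)
    return lambda x: ((x - target) ** 2).sum(dim=-1)


# Train the RNN optimizer itself by gradient descent through the unrolled episode.
opt_net = RNNOptimizer()
meta_opt = torch.optim.Adam(opt_net.parameters(), lr=1e-3)
for step in range(2000):
    batch = 64
    loss = opt_net.rollout(random_quadratic(batch), batch)
    meta_opt.zero_grad()
    loss.backward()
    meta_opt.step()
```

At test time the trained network would be queried sequentially on a genuinely derivative-free objective (e.g. a hyper-parameter tuning task), feeding back only observed function values; no gradients of the target function are needed, which is the sense in which the learned optimizer optimizes "without gradient descent".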