We present an efficient block-diagonal approximation to the Gauss-Newton matrix for feedforward neural networks. Our resulting algorithm is competitive against state-of-the-art first-order optimisation methods, with sometimes significant improvement in optimisation performance. Unlike first-order methods, for which hyperparameter tuning of the optimisation parameters is often a laborious process, our approach can provide good performance even when used with default settings. A side result of our work is that for piecewise linear transfer functions, the network objective function can have no differentiable local maxima, which may partially explain why such transfer functions facilitate effective optimisation.
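The abstract describes a per-layer (block-diagonal) approximation to the Gauss-Newton matrix. As a rough illustration of that idea only — not the authors' actual algorithm — the sketch below builds the exact per-layer Gauss-Newton blocks G_l = J_l^T J_l for a tiny one-hidden-layer ReLU network with squared-error loss (so the loss Hessian with respect to the output is the identity), then takes a damped, backtracked update using each block independently. The network sizes, damping value, and backtracking scheme are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny feedforward net: h = relu(W1 x), y = W2 h, squared-error loss.
d_in, d_h, d_out = 4, 5, 3
W1 = 0.5 * rng.standard_normal((d_h, d_in))
W2 = 0.5 * rng.standard_normal((d_out, d_h))
x = rng.standard_normal(d_in)
t = rng.standard_normal(d_out)

def forward(W1, W2):
    a = W1 @ x
    h = np.maximum(a, 0.0)
    return a, h, W2 @ h

def loss(W1, W2):
    _, _, y = forward(W1, W2)
    return 0.5 * np.sum((y - t) ** 2)

a, h, y = forward(W1, W2)
r = y - t  # residual; for squared error the loss Hessian wrt y is I

# Jacobians of the network output wrt each layer's (row-major flattened) weights.
# dy_i/dW2[k,j] = delta_{ik} * h_j  ->  J2 = kron(I, h^T)
J2 = np.kron(np.eye(d_out), h[None, :])
# dy_i/dW1[k,j] = W2[i,k] * 1[a_k > 0] * x_j  ->  J1 = kron(W2 * mask, x^T)
mask = (a > 0).astype(float)
J1 = np.kron(W2 * mask[None, :], x[None, :])

# Block-diagonal Gauss-Newton: keep only the per-layer blocks G_l = J_l^T J_l.
G1, G2 = J1.T @ J1, J2.T @ J2
g1, g2 = J1.T @ r, J2.T @ r  # per-layer gradients

# Damped per-block update, with backtracking since the blocks ignore
# cross-layer curvature (lam is an illustrative damping constant).
lam = 1e-2
d1 = -np.linalg.solve(G1 + lam * np.eye(G1.shape[0]), g1)
d2 = -np.linalg.solve(G2 + lam * np.eye(G2.shape[0]), g2)
L0 = loss(W1, W2)
alpha = 1.0
for _ in range(40):
    W1_new = W1 + alpha * d1.reshape(W1.shape)
    W2_new = W2 + alpha * d2.reshape(W2.shape)
    if loss(W1_new, W2_new) < L0:
        break
    alpha *= 0.5
```

Because each block is J_l^T J_l, it is symmetric positive semi-definite by construction, and the damped step is a descent direction; the backtracking loop then guarantees the loss decreases. The paper's contribution is an *efficient* Kronecker-factored approximation of these blocks rather than forming them explicitly as done here.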
Alex Botev (University College London)
Alex is a PhD student under the supervision of David Barber and part of the Machine Learning group at University College London. His research interests are in large-scale machine learning, with a main focus on deep learning: variational inference and generative models, NLP and representation learning, and optimization. Before joining UCL he graduated with an MEng degree in Computer Science with AI from the University of Southampton.
Hippolyt Ritter (University College London)
David Barber (University College London)
Related Events (a corresponding poster, oral, or spotlight)
2017 Talk: Practical Gauss-Newton Optimisation for Deep Learning »
Mon Aug 7th 08:09 -- 08:27 AM Room Parkside 2