Poster
A Distributed Second-Order Algorithm You Can Trust
Celestine Mendler-Dünner · Aurelien Lucchi · Matilde Gargiani · Yatao Bian · Thomas Hofmann · Martin Jaggi
Due to the rapid growth of data and computational resources, distributed optimization has become an active research area in recent years. While first-order methods seem to dominate the field, second-order methods are nevertheless attractive as they potentially require fewer communication rounds to converge. However, there are significant drawbacks that impede their wide adoption, such as the computation and the communication of a large Hessian matrix. In this paper, we present a new algorithm for distributed training of generalized linear models that only requires the computation of diagonal blocks of the Hessian matrix on the individual workers. To deal with this approximate information, we propose an adaptive approach that, akin to trust-region methods, dynamically adapts the auxiliary model to compensate for modeling errors. We provide theoretical rates of convergence for a wide class of problems, including $L_1$-regularized objectives. We also demonstrate that our approach achieves state-of-the-art results on multiple large benchmark datasets.
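To make the idea in the abstract concrete, here is a minimal single-process sketch. It is not the authors' algorithm. It assumes ridge regression as the generalized linear model, a feature-wise partition across simulated workers, and a simple ratio test for adapting the model. The names `block_diagonal_newton`, `sigma`, `gamma`, and the thresholds 0.25 and 0.75 are illustrative assumptions, not taken from the paper. Each simulated worker uses only its own diagonal Hessian block, and a scalar `sigma` scales the auxiliary quadratic model up or down depending on how well it predicted the actual decrease.

```python
import numpy as np


def objective(X, y, w, lam):
    """Ridge objective: 0.5/n * ||Xw - y||^2 + 0.5 * lam * ||w||^2."""
    residual = X @ w - y
    return 0.5 * (residual @ residual) / len(y) + 0.5 * lam * (w @ w)


def block_diagonal_newton(X, y, lam=0.1, num_workers=4, iters=100,
                          sigma=1.0, gamma=2.0):
    """Illustrative sketch: Newton-like steps using only diagonal Hessian blocks."""
    n, d = X.shape
    # Assumption: features are split evenly across the simulated workers.
    blocks = np.array_split(np.arange(d), num_workers)
    w = np.zeros(d)
    history = [objective(X, y, w, lam)]
    for _ in range(iters):
        # In a real distributed run, forming X @ w would need one communication round.
        grad = X.T @ (X @ w - y) / n + lam * w
        step = np.zeros(d)
        predicted = 0.0
        for idx in blocks:
            # Each worker only ever forms its own diagonal block of the Hessian.
            X_k = X[:, idx]
            H_kk = X_k.T @ X_k / n + lam * np.eye(len(idx))
            newton_k = np.linalg.solve(H_kk, grad[idx])
            # Minimizer of the sigma-scaled block-diagonal quadratic model.
            step[idx] = -newton_k / sigma
            # Decrease that the auxiliary model predicts for this block.
            predicted += (grad[idx] @ newton_k) / (2.0 * sigma)
        if predicted <= 1e-15:
            break  # gradient is numerically zero
        f_new = objective(X, y, w + step, lam)
        # Ratio of actual to predicted decrease, as in trust-region methods.
        rho = (history[-1] - f_new) / predicted
        if rho > 0.0:
            # Accept the step only if the objective actually decreased.
            w = w + step
            history.append(f_new)
        if rho < 0.25:
            sigma *= gamma   # model was too optimistic: make it more conservative
        elif rho > 0.75:
            sigma /= gamma   # model was reliable: allow a more aggressive step
    return w, history


# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 20))
w_true = rng.standard_normal(20)
y = X @ w_true + 0.1 * rng.standard_normal(2000)

w_hat, history = block_diagonal_newton(X, y)
print(f"accepted steps: {len(history) - 1}, final objective: {history[-1]:.6f}")
```

Because a step is accepted only when the objective decreases, the recorded objective values are monotone regardless of how poor the block-diagonal approximation is. The scaling `sigma` only affects how quickly progress is made.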
Author Information
Celestine Mendler-Dünner (IBM Research)
Aurelien Lucchi (ETH Zurich)
Matilde Gargiani (University of Freiburg)
Yatao Bian (ETH Zürich)
Thomas Hofmann (ETH Zurich)
Martin Jaggi (EPFL)
Related Events (a corresponding poster, oral, or spotlight)
- 2018 Oral: A Distributed Second-Order Algorithm You Can Trust »
  Thu Jul 12th 12:00 -- 12:10 PM Room A9
More from the Same Authors
- 2020 Poster: Extrapolation for Large-batch Training in Deep Learning »
  Tao Lin · Lingjing Kong · Sebastian Stich · Martin Jaggi
- 2020 Poster: Randomized Block-Diagonal Preconditioning for Parallel Learning »
  Celestine Mendler-Dünner · Aurelien Lucchi
- 2020 Poster: Optimizer Benchmarking Needs to Account for Hyperparameter Tuning »
  Prabhu Teja Sivaprasad · Florian Mai · Thijs Vogels · Martin Jaggi · François Fleuret
- 2020 Poster: A Unified Theory of Decentralized SGD with Changing Topology and Local Updates »
  Anastasiia Koloskova · Nicolas Loizou · Sadra Boreiri · Martin Jaggi · Sebastian Stich
- 2020 Poster: An Accelerated DFO Algorithm for Finite-sum Convex Functions »
  Yuwen Chen · Antonio Orvieto · Aurelien Lucchi
- 2019 Poster: Overcoming Multi-model Forgetting »
  Yassine Benyahia · Kaicheng Yu · Kamil Bennani-Smires · Martin Jaggi · Anthony C. Davison · Mathieu Salzmann · Claudiu Musat
- 2019 Poster: The Odds are Odd: A Statistical Test for Detecting Adversarial Examples »
  Kevin Roth · Yannic Kilcher · Thomas Hofmann
- 2019 Oral: Overcoming Multi-model Forgetting »
  Yassine Benyahia · Kaicheng Yu · Kamil Bennani-Smires · Martin Jaggi · Anthony C. Davison · Mathieu Salzmann · Claudiu Musat
- 2019 Oral: The Odds are Odd: A Statistical Test for Detecting Adversarial Examples »
  Kevin Roth · Yannic Kilcher · Thomas Hofmann
- 2019 Poster: Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication »
  Anastasiia Koloskova · Sebastian Stich · Martin Jaggi
- 2019 Poster: Optimal Continuous DR-Submodular Maximization and Applications to Provable Mean Field Inference »
  Yatao Bian · Joachim Buhmann · Andreas Krause
- 2019 Poster: Error Feedback Fixes SignSGD and other Gradient Compression Schemes »
  Sai Praneeth Reddy Karimireddy · Quentin Rebjock · Sebastian Stich · Martin Jaggi
- 2019 Oral: Decentralized Stochastic Optimization and Gossip Algorithms with Compressed Communication »
  Anastasiia Koloskova · Sebastian Stich · Martin Jaggi
- 2019 Oral: Optimal Continuous DR-Submodular Maximization and Applications to Provable Mean Field Inference »
  Yatao Bian · Joachim Buhmann · Andreas Krause
- 2019 Oral: Error Feedback Fixes SignSGD and other Gradient Compression Schemes »
  Sai Praneeth Reddy Karimireddy · Quentin Rebjock · Sebastian Stich · Martin Jaggi
- 2018 Poster: On Matching Pursuit and Coordinate Descent »
  Francesco Locatello · Anant Raj · Sai Praneeth Reddy Karimireddy · Gunnar Ratsch · Bernhard Schölkopf · Sebastian Stich · Martin Jaggi
- 2018 Oral: On Matching Pursuit and Coordinate Descent »
  Francesco Locatello · Anant Raj · Sai Praneeth Reddy Karimireddy · Gunnar Ratsch · Bernhard Schölkopf · Sebastian Stich · Martin Jaggi
- 2018 Poster: Escaping Saddles with Stochastic Gradients »
  Hadi Daneshmand · Jonas Kohler · Aurelien Lucchi · Thomas Hofmann
- 2018 Poster: Hyperbolic Entailment Cones for Learning Hierarchical Embeddings »
  Octavian-Eugen Ganea · Gary Becigneul · Thomas Hofmann
- 2018 Oral: Escaping Saddles with Stochastic Gradients »
  Hadi Daneshmand · Jonas Kohler · Aurelien Lucchi · Thomas Hofmann
- 2018 Oral: Hyperbolic Entailment Cones for Learning Hierarchical Embeddings »
  Octavian-Eugen Ganea · Gary Becigneul · Thomas Hofmann
- 2017 Poster: Guarantees for Greedy Maximization of Non-submodular Functions with Applications »
  Yatao Bian · Joachim Buhmann · Andreas Krause · Sebastian Tschiatschek
- 2017 Talk: Guarantees for Greedy Maximization of Non-submodular Functions with Applications »
  Yatao Bian · Joachim Buhmann · Andreas Krause · Sebastian Tschiatschek
- 2017 Poster: Sub-sampled Cubic Regularization for Non-convex Optimization »
  Jonas Kohler · Aurelien Lucchi
- 2017 Poster: Approximate Steepest Coordinate Descent »
  Sebastian Stich · Anant Raj · Martin Jaggi
- 2017 Talk: Sub-sampled Cubic Regularization for Non-convex Optimization »
  Jonas Kohler · Aurelien Lucchi
- 2017 Talk: Approximate Steepest Coordinate Descent »
  Sebastian Stich · Anant Raj · Martin Jaggi