Finite-sum optimization problems are ubiquitous in machine learning, and are commonly solved using first-order methods which rely on gradient computations. Recently, there has been growing interest in \emph{second-order} methods, which rely on both gradients and Hessians. In principle, second-order methods can require far fewer iterations than first-order methods, and hold the promise of more efficient algorithms. Although computing and manipulating Hessians is prohibitive for high-dimensional problems in general, the Hessians of individual functions in finite-sum problems can often be computed efficiently, e.g., because they possess a low-rank structure. Can second-order information indeed be used to solve such problems more efficiently? In this paper, we provide evidence that the answer -- perhaps surprisingly -- is negative, at least in terms of worst-case guarantees. However, we also discuss what additional assumptions and algorithmic approaches might potentially circumvent this negative result.
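To make the low-rank point concrete, below is a minimal NumPy sketch (illustrative only, not taken from the paper) for binary logistic regression: each individual loss f_i(w) = log(1 + exp(-y_i x_i^T w)) has a rank-one Hessian c_i x_i x_i^T with c_i = sigma(y_i x_i^T w)(1 - sigma(y_i x_i^T w)), so the full Hessian of the finite sum can be assembled from cheap per-example pieces and plugged into a Newton step. The function names and toy data are assumptions made for illustration.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def newton_step(w, X, y):
    """One Newton step for the finite-sum logistic loss
        f(w) = (1/n) * sum_i log(1 + exp(-y_i * x_i^T w)).

    Each individual Hessian is rank-one, H_i = c_i * x_i x_i^T with
    c_i = sigmoid(z_i) * (1 - sigmoid(z_i)) and z_i = y_i * x_i^T w,
    so the full Hessian is just a sum of scaled outer products.
    """
    n, d = X.shape
    z = y * (X @ w)                       # margins z_i = y_i * x_i^T w
    grad = -(X.T @ (sigmoid(-z) * y)) / n # gradient of the average loss
    c = sigmoid(z) * (1.0 - sigmoid(z))   # per-example curvature coefficients
    H = (X.T * c) @ X / n                 # (1/n) * sum_i c_i x_i x_i^T
    H += 1e-8 * np.eye(d)                 # tiny ridge for numerical stability
    return w - np.linalg.solve(H, grad)

# Toy usage on synthetic data
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = np.sign(X @ rng.standard_normal(5) + 0.1 * rng.standard_normal(200))
w = np.zeros(5)
for _ in range(10):
    w = newton_step(w, X, y)
```

This is the kind of structure the abstract alludes to: per-example Hessians are cheap to form and aggregate. Whether such second-order information can provably beat first-order methods in the worst case is exactly the question the paper answers in the negative.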
Author Information
Yossi Arjevani (Weizmann Institute of Science)
Ohad Shamir (Weizmann Institute of Science)
Related Events (a corresponding poster, oral, or spotlight)
- 2017 Talk: Oracle Complexity of Second-Order Methods for Finite-Sum Problems
  Mon. Aug 7th 12:48 -- 01:06 AM, Room Parkside 2
More from the Same Authors
- 2020 Poster: The Complexity of Finding Stationary Points with Stochastic Gradient Descent
  Yoel Drori · Ohad Shamir
- 2020 Poster: Proving the Lottery Ticket Hypothesis: Pruning is All You Need
  Eran Malach · Gilad Yehudai · Shai Shalev-Shwartz · Ohad Shamir
- 2020 Poster: Is Local SGD Better than Minibatch SGD?
  Blake Woodworth · Kumar Kshitij Patel · Sebastian Stich · Zhen Dai · Brian Bullins · Brendan McMahan · Ohad Shamir · Nati Srebro
- 2018 Poster: Spurious Local Minima are Common in Two-Layer ReLU Neural Networks
  Itay Safran · Ohad Shamir
- 2018 Oral: Spurious Local Minima are Common in Two-Layer ReLU Neural Networks
  Itay Safran · Ohad Shamir
- 2017 Poster: Online Learning with Local Permutations and Delayed Feedback
  Liran Szlak · Ohad Shamir
- 2017 Poster: Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis
  Dan Garber · Ohad Shamir · Nati Srebro
- 2017 Poster: Depth-Width Tradeoffs in Approximating Natural Functions With Neural Networks
  Itay Safran · Ohad Shamir
- 2017 Poster: Failures of Gradient-Based Deep Learning
  Shaked Shammah · Shai Shalev-Shwartz · Ohad Shamir
- 2017 Talk: Depth-Width Tradeoffs in Approximating Natural Functions With Neural Networks
  Itay Safran · Ohad Shamir
- 2017 Talk: Failures of Gradient-Based Deep Learning
  Shaked Shammah · Shai Shalev-Shwartz · Ohad Shamir
- 2017 Talk: Online Learning with Local Permutations and Delayed Feedback
  Liran Szlak · Ohad Shamir
- 2017 Talk: Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis
  Dan Garber · Ohad Shamir · Nati Srebro