Studying neural network loss landscapes provides insights into the nature of the underlying optimization problems. Unfortunately, loss landscapes are notoriously difficult to visualize in a human-comprehensible fashion. One common way to address this problem is to plot linear slices of the landscape, for example from the initial state of the network to the final state after optimization. On the basis of this analysis, prior work has drawn broader conclusions about the difficulty of the optimization problem. In this paper, we put inferences of this kind to the test, systematically evaluating how linear interpolation and final performance vary when altering the data, choice of initialization, and other optimizer and architecture design choices. Further, we use linear interpolation to study the role played by individual layers and substructures of the network. We find that certain layers are more sensitive to the choice of initialization, but that the shape of the linear path is not indicative of the changes in test accuracy of the model. Our results cast doubt on the broader intuition that the presence or absence of barriers when interpolating necessarily relates to the success of optimization.
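For readers unfamiliar with the technique: the linear slices referred to above are one-dimensional paths through weight space, theta(alpha) = (1 - alpha) * theta_init + alpha * theta_final, with the loss evaluated at each alpha to reveal any barriers along the path. A minimal PyTorch sketch of this evaluation follows; it is not the authors' code, and the names (`loss_along_linear_path`, `theta_init`, `theta_final`, standing for state dicts captured at initialization and after training) are illustrative.

```python
import copy
import torch

def loss_along_linear_path(model, theta_init, theta_final, loss_fn,
                           inputs, targets, n_points=25):
    """Evaluate the loss at evenly spaced points on the segment
    theta(alpha) = (1 - alpha) * theta_init + alpha * theta_final."""
    probe = copy.deepcopy(model)  # scratch copy so `model` is left untouched
    losses = []
    for alpha in torch.linspace(0.0, 1.0, n_points):
        blended = {}
        for name, w0 in theta_init.items():
            w1 = theta_final[name]
            if w0.is_floating_point():
                # Interpolate weights and floating-point buffers.
                blended[name] = (1 - alpha) * w0 + alpha * w1
            else:
                # Integer buffers (e.g. step counters) cannot be interpolated.
                blended[name] = w1
        probe.load_state_dict(blended)
        probe.eval()
        with torch.no_grad():
            losses.append(loss_fn(probe(inputs), targets).item())
    return losses
```

Plotting `losses` against alpha gives the linear-interpolation curve whose shape (monotone decrease versus a barrier) the paper scrutinizes.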
Author Information
Tiffany Vlaar (University of Edinburgh)
Jonathan Frankle (MosaicML / Harvard)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Spotlight: What Can Linear Interpolation of Neural Network Loss Landscapes Tell Us?
  Wed. Jul 20th, 02:35 -- 02:40 PM, Room Ballroom 1 & 2
More from the Same Authors
- 2021: Studying the Consistency and Composability of Lottery Ticket Pruning Masks
  Rajiv Movva · Michael Carbin · Jonathan Frankle
- 2022: Pre-Training on a Data Diet: Identifying Sufficient Examples for Early Training
  Mansheej Paul · Brett Larsen · Surya Ganguli · Jonathan Frankle · Gintare Karolina Dziugaite
- 2022: Knowledge Distillation for Efficient Sequences of Training Runs
  Xingyu Liu · Alexander Leonardi · Lu Yu · Christopher Gilmer-Hill · Matthew Leavitt · Jonathan Frankle
- 2022 Poster: Multirate Training of Neural Networks
  Tiffany Vlaar · Benedict Leimkuhler
- 2022 Spotlight: Multirate Training of Neural Networks
  Tiffany Vlaar · Benedict Leimkuhler
- 2021 Poster: On the Predictability of Pruning Across Scales
  Jonathan Rosenfeld · Jonathan Frankle · Michael Carbin · Nir Shavit
- 2021 Spotlight: On the Predictability of Pruning Across Scales
  Jonathan Rosenfeld · Jonathan Frankle · Michael Carbin · Nir Shavit
- 2021 Poster: Better Training using Weight-Constrained Stochastic Dynamics
  Benedict Leimkuhler · Tiffany Vlaar · Timothée Pouchon · Amos Storkey
- 2021 Spotlight: Better Training using Weight-Constrained Stochastic Dynamics
  Benedict Leimkuhler · Tiffany Vlaar · Timothée Pouchon · Amos Storkey
- 2020: Q&A: Jonathan Frankle
  Jonathan Frankle · Mayoore Jaiswal
- 2020: Contributed Talk: Jonathan Frankle
  Jonathan Frankle
- 2020 Poster: Linear Mode Connectivity and the Lottery Ticket Hypothesis
  Jonathan Frankle · Gintare Karolina Dziugaite · Daniel Roy · Michael Carbin