Poster in Workshop: High-dimensional Learning Dynamics Workshop: The Emergence of Structure and Reasoning
Exploring the development of complexity over depth and time in deep neural networks
Hannah Pinson · Aurélien Boland · Vincent Ginis · Mykola Pechenizkiy
Neural networks obtain their expressivity from nonlinear activation functions. It is often assumed that at each layer of a network, the input is effectively transformed in a nonlinear way. However, recent results have shown that the total function implemented by the network is close to linear at the beginning of training and only becomes more complex over time. It is unclear how the evolution of the overall function during training can be linked to changes in the effective (non)linearity of the individual network layers. In this study, we explore this evolution over time and depth. We present a straightforward way to assess the effective linearity of layers through the use of partly linear models; in the case of our 18-layer nonlinear convolutional neural network (CNN) trained on the ImageNet dataset, we find that a large fraction of the layers starts out in the effectively linear regime, and that layers become effectively nonlinear in the direction from deep to shallow layers. We also propose an alternative method to reveal this evolution in a computationally efficient way, and we extend our experimental results to the ResNet-50 architecture. For the networks and dataset we consider, our findings already indicate that the effective nonlinearity of networks changes over time and depth in a distinct wave-like pattern. The simple techniques we propose could thus help to gain valuable insights into the relationship between depth, training time, and the (in)ability to process complex datasets.
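The abstract does not spell out how the partly linear models are built, so the sketch below assumes one plausible instantiation: swap the ReLU activations up to a chosen depth for the identity and measure how often the resulting partly linear network agrees with the fully nonlinear one. The helper names (make_partly_linear, prediction_agreement), the use of torchvision's resnet18 as a stand-in for the 18-layer CNN, and the random dummy batch are illustrative assumptions, not the authors' actual procedure.

```python
# Minimal sketch (assumed illustration, not the paper's exact method):
# probe "effective linearity" by replacing the ReLU activations up to a
# chosen depth with the identity and checking how closely the resulting
# partly linear model tracks the original nonlinear network.

import copy
import torch
import torch.nn as nn
from torchvision.models import resnet18


def make_partly_linear(model: nn.Module, n_linear: int) -> nn.Module:
    """Return a copy of `model` in which the first `n_linear` ReLU modules
    (in registration order, i.e. roughly shallow to deep) are replaced by
    the identity; all weights are copied unchanged. Note that torchvision's
    BasicBlock reuses one ReLU module twice, so replacing it linearizes
    both of that block's activations."""
    partly_linear = copy.deepcopy(model)
    replaced = 0
    for name, module in partly_linear.named_modules():
        if isinstance(module, nn.ReLU) and replaced < n_linear:
            # Walk to the parent module and swap the ReLU for an identity.
            parent = partly_linear
            *path, leaf = name.split(".")
            for attr in path:
                parent = getattr(parent, attr)
            setattr(parent, leaf, nn.Identity())
            replaced += 1
    return partly_linear


@torch.no_grad()
def prediction_agreement(model_a: nn.Module, model_b: nn.Module, x: torch.Tensor) -> float:
    """Fraction of inputs on which both models predict the same class."""
    return (model_a(x).argmax(dim=1) == model_b(x).argmax(dim=1)).float().mean().item()


if __name__ == "__main__":
    torch.manual_seed(0)
    net = resnet18(weights=None).eval()   # stand-in for the paper's 18-layer CNN
    x = torch.randn(8, 3, 224, 224)       # dummy batch in place of ImageNet images
    for n_linear in (0, 2, 4, 9):
        probe = make_partly_linear(net, n_linear).eval()
        agree = prediction_agreement(net, probe, x)
        print(f"ReLUs replaced by identity: {n_linear}  agreement with original: {agree:.2f}")
```

On a trained network, sweeping the linearized depth at several training checkpoints would trace how the effectively linear region changes over training, which is the kind of depth-and-time pattern the abstract describes.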