Talk in Workshop: Principled Approaches to Deep Learning
Contributed Presentation 3 - Emergence of invariance and disentangling in deep representations
Alessandro Achille, Stefano Soatto
We show that invariance in a deep neural network is equivalent to the information minimality of the representation it computes, and that stacking layers and injecting noise during training naturally bias the network towards learning invariant representations. We then show that overfitting is related to the amount of information stored in the weights, and derive a sharp bound relating this information to the minimality and total correlation of the layer representations. This allows us to conclude that implicit and explicit regularization of the loss function not only help limit overfitting, but also foster invariance and disentangling of the learned representation. We also shed light on the properties of deep networks in relation to the geometry of the loss function.
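For context, the information-theoretic quantities the abstract invokes can be sketched as follows. This is a hedged reading using the standard Information Bottleneck formulation; the symbols z, n, beta, and D below are notational assumptions, not quotations from the talk.

% A nuisance n is a factor that affects the input x but carries no
% information about the task y:
I(n; y) = 0
% A representation z = f(x) is invariant to the nuisance n when it
% retains no information about it:
I(z; n) = 0
% Minimality: among representations sufficient for y, prefer the one
% retaining the least information about the input, i.e., minimize I(z; x).
% One standard way to penalize the "information stored in the weights"
% during cross-entropy training (an assumption, not a quotation):
\mathcal{L} = H_{p,q}(y \mid x, w) + \beta \, I(w; \mathcal{D})

Under this reading, driving I(z; x) down while preserving sufficiency for y simultaneously drives I(z; n) down for every nuisance n, which is the sense in which minimality and invariance coincide in the abstract's first claim.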