ICML Poster Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances

Berfin Simsek · François Ged · Arthur Jacot · Francesco Spadaro · Clement Hongler · Wulfram Gerstner · Johanni Brea

Keywords: [ Theory ] [ Representation Learning ] [ Algorithms ] [ Algorithms -> Large Scale Learning; Applications -> Natural Language Processing; Deep Learning ] [ Efficient Inference Methods ]

Abstract: We study how permutation symmetries in overparameterized multi-layer neural networks generate "symmetry-induced" critical points. Assuming a network with $L$ layers of minimal widths $r_1^*, \ldots, r_{L-1}^*$ reaches a zero-loss minimum at $r_1^*! \cdots r_{L-1}^*!$ isolated points that are permutations of one another, we show that adding one extra neuron to each layer is sufficient to connect all these previously discrete minima into a single manifold. For a two-layer overparameterized network of width $r^* + h =: m$, we explicitly describe the manifold of global minima: it consists of $T(r^*, m)$ affine subspaces of dimension at least $h$ that are connected to one another. For a network of width $m$, we identify the number $G(r, m)$ of affine subspaces containing only symmetry-induced critical points that are related to the critical points of a smaller network of width $r
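
The permutation symmetry behind these counts can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a bias-free two-layer network $f(x) = a^\top \tanh(Wx)$, and the names `two_layer` and `split_neuron` are hypothetical helpers introduced only for illustration. It shows that all $r!$ reorderings of the hidden neurons compute the same function, and that duplicating one neuron while splitting its outgoing weight as $\mu a_i$ and $(1-\mu) a_i$ leaves the function unchanged for every real $\mu$, which traces out a line of equivalent parameters in the wider network.

```python
import itertools
import math

import numpy as np


def two_layer(x, W, a):
    """f(x) = a^T tanh(W x); a bias-free two-layer network (illustrative assumption)."""
    return np.tanh(x @ W.T) @ a


def split_neuron(W, a, i, mu):
    """Hypothetical helper: copy hidden neuron i and split its outgoing weight.

    The copy shares the incoming weights W[i]; the outgoing weights become
    mu * a[i] (original neuron) and (1 - mu) * a[i] (copy), so their sum is a[i].
    """
    W_new = np.vstack([W, W[i:i + 1]])
    a_new = np.append(a, (1.0 - mu) * a[i])
    a_new[i] = mu * a[i]
    return W_new, a_new


rng = np.random.default_rng(0)
r, d = 3, 4                      # hidden width and input dimension (arbitrary choices)
W = rng.normal(size=(r, d))      # incoming weights, one row per hidden neuron
a = rng.normal(size=r)           # outgoing weights
X = rng.normal(size=(10, d))     # random probe inputs
base = two_layer(X, W, a)

# Every reordering of the hidden neurons gives the same outputs.
perms = list(itertools.permutations(range(r)))
all_equal = all(
    np.allclose(two_layer(X, W[list(p)], a[list(p)]), base) for p in perms
)
n_perms = len(perms)             # equals r!

# Adding one neuron by splitting neuron 0 preserves the function for any mu.
split_ok = all(
    np.allclose(two_layer(X, *split_neuron(W, a, 0, mu)), base)
    for mu in (-1.0, 0.0, 0.3, 1.0, 2.5)
)
```

Here `mu` is a free parameter, so each split yields a one-dimensional family of parameter vectors in the width $r+1$ network that all realize the same function. This is only meant to make the abstract's affine-subspace picture concrete for a single added neuron.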
