## Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances

### Berfin Simsek · François Ged · Arthur Jacot · Francesco Spadaro · Clement Hongler · Wulfram Gerstner · Johanni Brea

Keywords: [ Representation Learning ] [ Algorithms ] [ Theory ] [ Algorithms -> Large Scale Learning; Applications -> Natural Language Processing; Deep Learning ] [ Efficient Inference Methods ]

Abstract: We study how permutation symmetries in overparameterized multi-layer neural networks generate 'symmetry-induced' critical points. Assuming a network with $L$ layers of minimal widths $r_1^*, \ldots, r_{L-1}^*$ reaches a zero-loss minimum at $r_1^*! \cdots r_{L-1}^*!$ isolated points that are permutations of one another, we show that adding one extra neuron to each layer is sufficient to connect all these previously discrete minima into a single manifold. For a two-layer overparameterized network of width $r^* + h =: m$ we explicitly describe the manifold of global minima: it consists of $T(r^*, m)$ affine subspaces of dimension at least $h$ that are connected to one another. For a network of width $m$, we identify the number $G(r,m)$ of affine subspaces containing only symmetry-induced critical points that are related to the critical points of a smaller network of width \$r
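
As a minimal illustration of the permutation symmetry the abstract refers to, the sketch below builds a toy two-layer network and checks two things: that all $r!$ reorderings of its hidden neurons compute the same function, and that adding one extra neuron by duplicating an existing one and splitting its outgoing weight gives a one-parameter line of equivalent parameter settings. Everything here is an assumption made for illustration and is not taken from the paper: the scalar input, the `tanh` activation, the specific weights, and the helper names `two_layer`, `permuted_copies`, and `split_neuron`. The duplication step is a standard construction used here only to show how an extra neuron can yield an affine family of equivalent parameters.

```python
import itertools
import math


def two_layer(neurons, x):
    """Toy scalar two-layer net: sum_j a_j * tanh(w_j * x + b_j).

    Each neuron is a tuple (w, b, a): input weight, bias, output weight.
    """
    return sum(a * math.tanh(w * x + b) for (w, b, a) in neurons)


def permuted_copies(neurons):
    """All reorderings of the hidden neurons (r! of them for r neurons)."""
    return [list(p) for p in itertools.permutations(neurons)]


def split_neuron(neurons, index, mu):
    """Widen the net by one neuron: duplicate neuron `index` and split its
    output weight a into mu * a and (1 - mu) * a (hypothetical helper)."""
    w, b, a = neurons[index]
    rest = neurons[:index] + neurons[index + 1:]
    return rest + [(w, b, mu * a), (w, b, (1.0 - mu) * a)]


# Illustrative weights for a width-3 network; the values are arbitrary.
neurons = [(0.5, -1.0, 2.0), (-1.5, 0.3, -0.7), (2.0, 0.0, 1.1)]
inputs = [-2.0, -0.5, 0.0, 1.0, 3.0]
reference = [two_layer(neurons, x) for x in inputs]

# 1) Every permutation is a distinct parameter point with identical outputs.
copies = permuted_copies(neurons)
num_distinct = len({tuple(c) for c in copies})
all_permutations_agree = all(
    math.isclose(two_layer(c, x), y, abs_tol=1e-12)
    for c in copies
    for x, y in zip(inputs, reference)
)

# 2) With one extra neuron, any mu gives the same function: a line of
#    equivalent parameters in the wider network.
split_agrees = all(
    math.isclose(two_layer(split_neuron(neurons, 0, mu), x), y, abs_tol=1e-12)
    for mu in (-1.0, 0.0, 0.25, 1.0, 2.5)
    for x, y in zip(inputs, reference)
)

print(num_distinct, all_permutations_agree, split_agrees)
```

The comparison uses `math.isclose` rather than exact equality because reordering the summation changes floating-point rounding slightly even though the functions are mathematically identical.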