Skip to yearly menu bar Skip to main content


Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances

Berfin Simsek · Fran├žois Ged · Arthur Jacot · Francesco Spadaro · Clement Hongler · Wulfram Gerstner · Johanni Brea

Keywords: [ Efficient Inference Methods; ] [ Algorithms -> Large Scale Learning; Applications -> Natural Language Processing; Deep Learning ] [ Algorithms ] [ Representation Learning ] [ Theory ]

Abstract: We study how permutation symmetries in overparameterized multi-layer neural networks generate `symmetry-induced' critical points. Assuming a network with $ L $ layers of minimal widths $ r_1^*, \ldots, r_{L-1}^* $ reaches a zero-loss minimum at $ r_1^*! \cdots r_{L-1}^*! $ isolated points that are permutations of one another, we show that adding one extra neuron to each layer is sufficient to connect all these previously discrete minima into a single manifold. For a two-layer overparameterized network of width $ r^*+ h =: m $ we explicitly describe the manifold of global minima: it consists of $ T(r^*, m) $ affine subspaces of dimension at least $ h $ that are connected to one another. For a network of width $m$, we identify the number $G(r,m)$ of affine subspaces containing only symmetry-induced critical points that are related to the critical points of a smaller network of width $r

Chat is not available.