Poster in Workshop: Over-parameterization: Pitfalls and Opportunities
On the Sparsity of Deep Neural Networks in the Overparameterized Regime: An Empirical Study
Rahul Parhi · Jack Wolf · Robert Nowak
Sparsity and low-rank structures have been incorporated into neural networks to reduce computational complexity and to improve generalization and robustness. Recent theoretical developments show that both are natural characteristics of data-fitting solutions cast in a new family of Banach spaces referred to as RBV2 spaces, the spaces of second-order bounded variation in the Radon domain. Moreover, sparse and deep ReLU networks are solutions to infinite-dimensional variational problems posed in compositions of these spaces. This means that these learning problems can be recast as parametric optimizations over neural network weights. Remarkably, standard weight decay and its variants correspond exactly to regularizing the RBV2-norm in the function space. The empirical validation in this paper confirms that weight decay leads to sparse and low-rank networks, as predicted by the theory.
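To illustrate the kind of empirical check the abstract describes, the sketch below trains an overparameterized ReLU network with standard weight decay in PyTorch and then measures neuron-level sparsity and the effective rank of an intermediate weight matrix. This is a minimal sketch under illustrative assumptions; the toy data, architecture, thresholds, and variable names are not taken from the paper's actual experiments.

```python
# Minimal sketch (not the authors' code): train a deep ReLU network with
# standard weight decay and inspect sparsity / low-rank structure afterward.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression data in 10 dimensions (illustrative assumption).
d = 10
x = torch.randn(2048, d)
y = torch.sin(x @ torch.randn(d, 1)) + 0.05 * torch.randn(2048, 1)

# Overparameterized deep ReLU network.
width = 256
model = nn.Sequential(
    nn.Linear(d, width), nn.ReLU(),
    nn.Linear(width, width), nn.ReLU(),
    nn.Linear(width, 1),
)

# Standard weight decay, which the theory identifies with RBV2-norm regularization.
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
loss_fn = nn.MSELoss()

for _ in range(3000):
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

# Neuron-level sparsity of the last hidden layer: units whose incoming and
# outgoing weights are both negligible are effectively pruned away.
W_in = model[2].weight.detach()               # (width, width) incoming weights
w_out = model[4].weight.detach().squeeze(0)   # (width,) outgoing weights
scores = W_in.norm(dim=1) * w_out.abs()
active = (scores > 1e-3 * scores.max()).sum().item()
print(f"active hidden units: {active} / {width}")

# Low-rank structure: effective rank of the intermediate weight matrix,
# measured by thresholding its singular values (threshold is an arbitrary choice).
s = torch.linalg.svdvals(W_in)
eff_rank = (s > 1e-3 * s.max()).sum().item()
print(f"effective rank of intermediate layer: {eff_rank} / {min(W_in.shape)}")
```

Under the theory summarized above, one would expect the active-unit count and the effective rank to drop noticeably as the weight-decay coefficient is increased, relative to training the same model with weight decay set to zero.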