Timezone: »

On the Sparsity of Deep Neural Networks in the Overparameterized Regime: An Empirical Study
Rahul Parhi · Jack Wolf · Robert Nowak

Sparsity and low-rank structures have been incorporated into neural networks to reduce computational complexity and to improve generalization and robustness. Recent theoretical developments show that both are natural characteristics of data-fitting solutions cast in a new family of Banach spaces referred to as RBV2 spaces, the spaces of second-order bounded variation in the Radon domain. Moreover, sparse and deep ReLU networks are solutions to infinite dimensional variational problems in compositions of these spaces. This means that these learning problems can be recast as parametric optimizations over neural network weights. Remarkably, standard weight decay and variants correspond exactly to regularizing the RBV2-norm in the function space. Empirical validation in this paper confirm that weight decay leads to sparse and low-rank networks, as predicted by the theory.

Author Information

Rahul Parhi (University of Wisconsin-Madison)
Jack Wolf (University of Wisconsin-Madison)
Robert Nowak (University of Wisconsion-Madison)
Robert Nowak

Robert Nowak holds the Nosbusch Professorship in Engineering at the University of Wisconsin-Madison, where his research focuses on signal processing, machine learning, optimization, and statistics.

More from the Same Authors