

Invited talk in Workshop: Over-parameterization: Pitfalls and Opportunities

Function space view of Multi-Channel Linear Convolutional Networks with Bounded Weight Norm

Suriya Gunasekar


Abstract:

The magnitude of the weights of a neural network is a fundamental measure of complexity that plays a crucial role in the study of implicit and explicit regularization. For example, recent work has shown that gradient descent updates in overparameterized models asymptotically lead to solutions that implicitly minimize the ℓ2 norm of the parameters of the model, resulting in an inductive bias that is highly architecture-dependent. To investigate the properties of learned functions, it is natural to consider a function space view given by the minimum ℓ2 norm of weights required to realize a given function with a given network. We call this the “induced regularizer” of the network. Building on a line of recent work, we study the induced regularizer of linear convolutional neural networks, with a focus on the role of kernel size and the number of channels. We introduce an SDP relaxation of the induced regularizer, which we show is tight for networks with a single input channel. Using this SDP formulation, we show that the induced regularizer is independent of the number of output channels for single-input-channel networks, and for multi-input-channel networks, we show independence given sufficiently many output channels. Moreover, we show that as the kernel size increases, the induced regularizer interpolates between a basis-invariant norm and a basis-dependent norm that promotes sparse structures in Fourier space. Based on joint work with Meena Jagadeesan and Ilya Razenshteyn.
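As an illustrative formalization of the quantity described above (the notation here is mine, not taken from the talk): for a linear network architecture whose weights w realize a linear map x ↦ β_w^T x, the induced regularizer of a linear function β can be written as

% Sketch of the "induced regularizer" from the abstract, under the assumption
% that the network is linear and complexity is measured by the total squared
% ℓ2 norm of all weights; R and β_w are notation introduced here for illustration.
\[
  R(\beta) \;=\; \min_{w} \bigl\{ \, \|w\|_2^2 \;:\; \beta_w = \beta \, \bigr\},
\]
% i.e., the smallest total squared weight norm over all parameterizations w of
% the given architecture that realize the target linear function β.

In this view, the talk's results characterize how R depends on the architecture of a multi-channel linear convolutional network, in particular on its kernel size and its numbers of input and output channels.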