

Poster

Is Kernel Prediction More Powerful than Gating in Convolutional Neural Networks?

Lorenz K. Muller

Hall C 4-9 #408
Wed 24 Jul 2:30 a.m. PDT — 4 a.m. PDT

Abstract:

Neural networks whose weights are the output of a predictor (HyperNetworks) achieve excellent performance on many tasks. In ConvNets, kernel prediction layers are a popular type of HyperNetwork. Previous theoretical work has argued that a hierarchy of multiplicative interactions exists in which gating is at the bottom and full weight prediction, as in HyperNetworks, is at the top. In this paper, we constructively demonstrate an equivalence between gating combined with fixed weight layers and weight prediction, relativizing the notion of a hierarchy of multiplicative interactions. We further derive an equivalence between a restricted type of HyperNetwork and factorization machines. Finally, we find empirically that gating layers can learn to imitate weight prediction layers with an SGD variant and show a novel practical application in image denoising using kernel prediction networks. Our reformulation of predicted kernels, combining fixed layers and gating, reduces memory requirements.
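To make the abstract's central equivalence concrete, here is a minimal sketch, not the authors' code, of the two formulations it relates: a conv layer whose kernel is predicted from the input (a HyperNetwork), and a fixed kernel bank combined by input-dependent gates. The class names KernelPredictionConv and GatedFixedConv, the global-average-pooled predictor input, and the batch-of-one restriction are all illustrative assumptions, not details from the paper.

```python
# Hedged sketch: kernel prediction vs. gating over fixed kernels.
# Class names and design details are hypothetical illustrations.
import torch
import torch.nn as nn
import torch.nn.functional as F

class KernelPredictionConv(nn.Module):
    """Conv layer whose full kernel is predicted from the input."""
    def __init__(self, c_in, c_out, k):
        super().__init__()
        self.c_out, self.c_in, self.k = c_out, c_in, k
        # Small predictor: pooled input features -> all kernel entries.
        self.predict = nn.Linear(c_in, c_out * c_in * k * k)

    def forward(self, x):                        # x: (1, c_in, H, W); batch of 1 assumed
        feats = x.mean(dim=(2, 3))               # global average pooling -> (1, c_in)
        w = self.predict(feats).view(self.c_out, self.c_in, self.k, self.k)
        return F.conv2d(x, w, padding=self.k // 2)

class GatedFixedConv(nn.Module):
    """Fixed kernel bank mixed by input-dependent gates.

    The effective kernel is a gated sum of fixed basis kernels, so only
    the gates (one scalar per basis kernel) depend on the input; the
    kernels themselves are ordinary fixed weights.
    """
    def __init__(self, c_in, c_out, k, n_basis=8):
        super().__init__()
        self.basis = nn.Parameter(torch.randn(n_basis, c_out, c_in, k, k))
        self.gate = nn.Linear(c_in, n_basis)     # predicts the mixing gates
        self.k = k

    def forward(self, x):                        # x: (1, c_in, H, W); batch of 1 assumed
        g = self.gate(x.mean(dim=(2, 3)))        # gates: (1, n_basis)
        w = torch.einsum('b,boikl->oikl', g[0], self.basis)
        return F.conv2d(x, w, padding=self.k // 2)

x = torch.randn(1, 16, 32, 32)
print(KernelPredictionConv(16, 16, 3)(x).shape)  # torch.Size([1, 16, 32, 32])
print(GatedFixedConv(16, 16, 3)(x).shape)        # torch.Size([1, 16, 32, 32])
```

In this sketch the gated version stores only a small bank of fixed kernels plus a tiny gate head rather than a predictor emitting every kernel entry, which gestures at how a gating reformulation of predicted kernels can reduce memory; the paper's actual construction and memory analysis should be consulted for the precise statement.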
