Poster in Affinity Workshop: LatinX in AI (LXAI) Workshop
1-Path-Norm Regularization of Deep Neural Networks
Fabian Latorre · Antoine Bonnet · Paul Rolland · Nadav Hallak · Volkan Cevher
Keywords: [ Neural Networks ] [ Deep Learning ] [ Robustness ] [ Nonconvex Optimization ] [ Path Norm ] [ Generalization ]
The path-norm measure is considered one of the best indicators of good generalization in neural networks. This paper introduces a proximal gradient framework for training deep neural networks with 1-path-norm regularization, applicable to general deep architectures. We address the resulting nonconvex, nonsmooth optimization problem by transforming the induced, intractable proximal operation into an equivalent differentiable one. In numerical experiments on FashionMNIST and CIFAR10, we compare the proximal gradient framework against training via automatic differentiation (backpropagation). We show that 1-path-norm regularization is a better choice than weight decay for fully connected architectures and that it improves robustness to noisy labels. In the latter setting, proximal gradient methods have an advantage over automatic differentiation.
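To make the regularizer concrete, here is a minimal sketch (not the authors' code) of the automatic-differentiation baseline: for a bias-free fully connected ReLU network with weight matrices W_1, ..., W_L, the 1-path-norm equals 1^T |W_L| ... |W_1| 1, the sum over all input-output paths of the product of absolute weights along each path, and it can be added as a penalty and minimized with plain backpropagation. The layer sizes, regularization strength, and helper name one_path_norm below are illustrative assumptions.

```python
import torch
import torch.nn as nn

def one_path_norm(linear_layers):
    """1^T |W_L| ... |W_1| 1: sum over input-output paths of |weight| products."""
    v = torch.ones(linear_layers[0].in_features)
    for layer in linear_layers:
        v = layer.weight.abs() @ v        # propagate the all-ones vector through |W|
    return v.sum()

# Toy bias-free fully connected network (sizes are illustrative only).
fc1 = nn.Linear(784, 256, bias=False)
fc2 = nn.Linear(256, 10, bias=False)
model = nn.Sequential(fc1, nn.ReLU(), fc2)

lam = 1e-4                                 # assumed regularization strength
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(32, 784)                   # dummy batch
y = torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(model(x), y) + lam * one_path_norm([fc1, fc2])
opt.zero_grad()
loss.backward()                            # (sub)gradient of the nonsmooth penalty via autodiff
opt.step()
```

The proximal gradient framework proposed in the paper instead handles the nonsmooth 1-path-norm term through its (reformulated) proximal operation rather than by differentiating it directly as above.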