Poster in Workshop: Differentiable Almost Everything: Differentiable Relaxations, Algorithms, Operators, and Simulators
DNArch: Learning Convolutional Neural Architectures by Backpropagation
David Romero · Neil Zeghidour
We present Differentiable Neural Architectures (DNArch), a method that jointly learns the weights and the architecture of CNNs by backpropagation. DNArch enables learning (i) the size of convolutional kernels, (ii) the width of all layers, (iii) the position and values of downsampling layers, and (iv) the depth of the network. DNArch treats neural architectures as continuous entities and uses learnable differentiable masks to control their size. Unlike existing methods, DNArch is not limited to a (small) predefined set of possible components; instead, it can discover CNN architectures across all feasible combinations of kernel sizes, widths, depths, and downsampling layers. Empirically, DNArch finds effective architectures for classification and dense prediction tasks on sequential and image data. By adding a loss term that controls network complexity, DNArch constrains its search to architectures that respect a predefined computational budget during training.
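To make the core idea concrete, below is a minimal sketch of a learnable differentiable mask controlling the effective size of a convolutional kernel, so that kernel size can be learned by backpropagation alongside the weights. It is not the authors' implementation: the Gaussian mask form, the PyTorch framing, and all names (MaskedConv1d, log_mask_width) are illustrative assumptions; DNArch applies analogous masks to layer widths, depth, and downsampling.

```python
# Sketch only: a learnable Gaussian mask over kernel positions lets the
# effective kernel size shrink or grow under gradient descent.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedConv1d(nn.Module):
    def __init__(self, channels: int, max_kernel_size: int = 33):
        super().__init__()
        # Full-size kernel; the mask decides how much of it is effectively used.
        self.weight = nn.Parameter(
            torch.randn(channels, channels, max_kernel_size) * 0.02
        )
        # Learnable (log-)width of the Gaussian mask over kernel positions.
        self.log_mask_width = nn.Parameter(torch.tensor(0.0))
        # Relative positions in [-1, 1] across the kernel support.
        self.register_buffer("positions", torch.linspace(-1.0, 1.0, max_kernel_size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Gaussian mask: entries far from the kernel center are suppressed,
        # so the effective kernel size is controlled by a continuous parameter.
        mask_width = torch.exp(self.log_mask_width)
        mask = torch.exp(-0.5 * (self.positions / mask_width) ** 2)
        masked_weight = self.weight * mask  # broadcast over in/out channels
        return F.conv1d(x, masked_weight, padding=masked_weight.shape[-1] // 2)


if __name__ == "__main__":
    conv = MaskedConv1d(channels=8)
    x = torch.randn(4, 8, 128)  # (batch, channels, length)
    loss = conv(x).mean()
    loss.backward()  # gradients flow to both the kernel weights and the mask width
    print(conv.log_mask_width.grad)
```

In this sketch, a complexity penalty (e.g., a term proportional to the mask width) could be added to the training loss to keep the discovered architecture within a computational budget, mirroring the budget-constrained search described in the abstract.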