Poster in Workshop: Differentiable Almost Everything: Differentiable Relaxations, Algorithms, Operators, and Simulators
DNArch: Learning Convolutional Neural Architectures by Backpropagation
David Romero · Neil Zeghidour
We present Differentiable Neural Architectures (DNArch), a method that jointly learns the weights and the architecture of CNNs by backpropagation. DNArch enables learning (i) the size of convolutional kernels, (ii) the width of all layers, (iii) the position and values of downsampling layers, and (iv) the depth of the network. DNArch treats neural architectures as continuous entities and uses learnable differentiable masks to control their size. Unlike existing methods, DNArch is not limited to a (small) predefined set of possible components; instead, it can discover CNN architectures across all feasible combinations of kernel sizes, widths, depths, and downsampling layers. Empirically, DNArch finds effective architectures for classification and dense prediction tasks on sequential and image data. By adding a loss term that controls network complexity, DNArch constrains its search to architectures that respect a predefined computational budget during training.
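To make the core idea concrete, below is a minimal sketch of a learnable differentiable mask controlling the effective size of a convolutional kernel, so that kernel size can be learned by backpropagation alongside the weights. It is not the authors' implementation: the Gaussian mask form, the PyTorch framing, and all names (MaskedConv1d, log_mask_width) are illustrative assumptions; DNArch applies analogous masks to layer widths, depth, and downsampling.

```python
# Sketch only: a learnable Gaussian mask over kernel positions lets the
# effective kernel size shrink or grow under gradient descent.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedConv1d(nn.Module):
    def __init__(self, channels: int, max_kernel_size: int = 33):
        super().__init__()
        # Full-size kernel; the mask decides how much of it is effectively used.
        self.weight = nn.Parameter(
            torch.randn(channels, channels, max_kernel_size) * 0.02
        )
        # Learnable (log-)width of the Gaussian mask over kernel positions.
        self.log_mask_width = nn.Parameter(torch.tensor(0.0))
        # Relative positions in [-1, 1] across the kernel support.
        self.register_buffer("positions", torch.linspace(-1.0, 1.0, max_kernel_size))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Gaussian mask: entries far from the kernel center are suppressed,
        # so the effective kernel size is controlled by a continuous parameter.
        mask_width = torch.exp(self.log_mask_width)
        mask = torch.exp(-0.5 * (self.positions / mask_width) ** 2)
        masked_weight = self.weight * mask  # broadcast over in/out channels
        return F.conv1d(x, masked_weight, padding=masked_weight.shape[-1] // 2)


if __name__ == "__main__":
    conv = MaskedConv1d(channels=8)
    x = torch.randn(4, 8, 128)  # (batch, channels, length)
    loss = conv(x).mean()
    loss.backward()  # gradients flow to both the kernel weights and the mask width
    print(conv.log_mask_width.grad)
```

In this sketch, a complexity penalty (e.g., a term proportional to the mask width) could be added to the training loss to keep the discovered architecture within a computational budget, mirroring the budget-constrained search described in the abstract.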