Poster

Multicoated Supermasks Enhance Hidden Networks

Yasuyuki Okoshi ⋅ Ángel López García-Arias ⋅ Kazutoshi Hirose ⋅ Kota Ando ⋅ Kazushi Kawamura ⋅ Thiem Van Chu ⋅ Masato Motomura ⋅ Jaehoon Yu

Keywords: DL: Algorithms; APP: Computer Vision; Deep Learning

2022 Poster


Abstract

Hidden Networks (Ramanujan et al., 2020) showed that accurate subnetworks can be found within a randomly weighted neural network by training a connectivity mask, referred to as a supermask. We show that the supermask stops improving even though gradients are not zero, thus underutilizing backpropagated information. To address this, we propose a method that extends Hidden Networks by training an overlay of multiple hierarchical supermasks, which we call a multicoated supermask. This method shows that using multiple supermasks for a single task achieves higher accuracy without additional training cost. Experiments on CIFAR-10 and ImageNet show that Multicoated Supermasks improve the tradeoff between accuracy and model size. A ResNet-101 using a 7-coated supermask outperforms its Hidden Networks counterpart by 4%, matching the accuracy of a dense ResNet-50 while being an order of magnitude smaller.
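As an illustration of the idea in the abstract, the following is a minimal NumPy sketch of how an overlay of hierarchical supermasks might be built on top of frozen random weights. It rests on assumptions not stated in the abstract: that every coat is a top-k selection over one shared score tensor, that the coats are nested, and that the overlay is their elementwise sum. The names `topk_mask`, `multicoated_mask`, and `keep_fractions` are hypothetical, and the score training itself (backpropagation through the mask) is omitted.

```python
import numpy as np

def topk_mask(scores, keep_fraction):
    """Binary mask that keeps the keep_fraction highest-scoring entries."""
    flat = scores.ravel()
    k = int(round(keep_fraction * flat.size))
    mask = np.zeros(flat.size, dtype=np.int64)
    if k > 0:
        top_idx = np.argsort(flat)[-k:]  # indices of the k largest scores
        mask[top_idx] = 1
    return mask.reshape(scores.shape)

def multicoated_mask(scores, keep_fractions):
    """Assumed overlay: elementwise sum of top-k masks over shared scores."""
    return sum(topk_mask(scores, f) for f in keep_fractions)

rng = np.random.default_rng(0)
weights = rng.standard_normal((4, 8))  # random weights, never updated
scores = rng.random((4, 8))            # stand-in for trained scores

# Assumed example values: each coat keeps fewer connections than the last.
keep_fractions = [0.5, 0.25, 0.125]
coats = [topk_mask(scores, f) for f in keep_fractions]
mask = multicoated_mask(scores, keep_fractions)

# A connection selected by more coats receives a larger integer multiplier.
effective_weights = weights * mask
```

With a single entry in `keep_fractions`, the sketch reduces to one binary supermask in the style of Hidden Networks. With several entries, each surviving weight is scaled by the number of coats that select it, so the only extra information to store per connection is a small integer.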
