

Poster in Workshop: Neural Compression: From Information Theory to Applications

ICE-Pick: Iterative Cost-Efficient Pruning for DNNs

Wenhao Hu · Perry Gibson · José Cano


Abstract:

Pruning is one of the main compression methods for Deep Neural Networks (DNNs), where less relevant parameters are removed from the model to reduce its memory footprint. To achieve better final accuracy, pruning is often performed iteratively, with an increasing fraction of parameters removed at each step, and fine-tuning (i.e., additional training epochs) applied to the remaining parameters. However, this process can be very time-consuming, since fine-tuning is applied after every pruning step and computes gradients for the whole model. Motivated by these overheads, in this paper we propose ICE-Pick, a novel threshold-guided fine-tuning method which freezes less sensitive layers and leverages a custom pruning-aware learning rate scheduler. We evaluate our technique using ResNet-110, ResNet-152, and MobileNetV2 (defined for CIFAR-10), and show that ICE-Pick can save up to 87.6% of the pruning time while maintaining accuracy.
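To make the prune/freeze/fine-tune loop concrete, the sketch below shows one plausible way to implement it in PyTorch. It is an illustrative assumption, not the authors' code: the sensitivity proxy (relative weight change between pruning steps), the `threshold` hyperparameter, and the simple step-decay learning rate rule are all stand-ins for ICE-Pick's actual sensitivity metric and pruning-aware scheduler.

```python
# Hypothetical sketch of an ICE-Pick-style loop: prune, freeze layers whose
# weights barely moved since the last step, then fine-tune the rest with a
# pruning-step-dependent learning rate. Heuristics here are illustrative, not
# the paper's implementation.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune


def prune_step(model, amount):
    """Apply unstructured L1 magnitude pruning to every Conv2d/Linear layer."""
    for module in model.modules():
        if isinstance(module, (nn.Conv2d, nn.Linear)):
            prune.l1_unstructured(module, name="weight", amount=amount)


def freeze_insensitive_layers(model, prev_state, threshold):
    """Freeze parameters whose relative L2 change since the previous pruning
    step is below `threshold` (a hypothetical proxy for layer sensitivity)."""
    for name, param in model.named_parameters():
        if name not in prev_state:
            continue  # names shift (e.g., weight -> weight_orig) after pruning
        delta = (param.detach() - prev_state[name]).norm()
        rel_change = delta / (prev_state[name].norm() + 1e-12)
        param.requires_grad = rel_change.item() >= threshold


def iterative_prune(model, train_loader, loss_fn, steps=5, amount=0.2,
                    base_lr=0.01, threshold=0.01, epochs_per_step=2):
    """Iterative prune -> freeze -> fine-tune loop. The step-decay learning
    rate below is a placeholder for a pruning-aware scheduler."""
    prev_state = {k: v.clone() for k, v in model.state_dict().items()}
    for step in range(steps):
        prune_step(model, amount)
        freeze_insensitive_layers(model, prev_state, threshold)
        lr = base_lr / (1 + step)  # placeholder pruning-aware schedule
        params = [p for p in model.parameters() if p.requires_grad]
        opt = torch.optim.SGD(params, lr=lr)
        model.train()
        for _ in range(epochs_per_step):
            for x, y in train_loader:
                opt.zero_grad()
                loss_fn(model(x), y).backward()
                opt.step()
        prev_state = {k: v.detach().clone()
                      for k, v in model.state_dict().items()}
```

Because frozen layers contribute no gradients and their optimizer state is dropped, each fine-tuning pass touches only the sensitive subset of the model, which is where the reported time savings would come from under this reading of the method.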

Virtual talk: https://drive.google.com/file/d/1TmmRBfNXz-5hLSq6UyNqfakAw7YcE7NO/view?usp=drive_link
