Timezone: »

Do We Actually Need Dense Over-Parameterization? In-Time Over-Parameterization in Sparse Training
Shiwei Liu · Lu Yin · Decebal Mocanu · Mykola Pechenizkiy

Wed Jul 21 09:00 PM -- 11:00 PM (PDT) @

In this paper, we introduce a new perspective on training deep neural networks capable of state-of-the-art performance without the need for the expensive over-parameterization by proposing the concept of In-Time Over-Parameterization (ITOP) in sparse training. By starting from a random sparse network and continuously exploring sparse connectivities during training, we can perform an Over-Parameterization over the course of training, closing the gap in the expressibility between sparse training and dense training. We further use ITOP to understand the underlying mechanism of Dynamic Sparse Training (DST) and discover that the benefits of DST come from its ability to consider across time all possible parameters when searching for the optimal sparse connectivity. As long as sufficient parameters have been reliably explored, DST can outperform the dense neural network by a large margin. We present a series of experiments to support our conjecture and achieve the state-of-the-art sparse training performance with ResNet-50 on ImageNet. More impressively, ITOP achieves dominant performance over the overparameterization-based sparse methods at extreme sparsities. When trained with ResNet-34 on CIFAR-100, ITOP can match the performance of the dense model at an extreme sparsity 98%.

Author Information

Shiwei Liu (Eindhoven University of Technology)

Shiwei Liu is a Postdoctoral Fellow at the University of Texas at Austin. He obtained his Ph.D. from the Eindhoven University of Technology in 2022. His research interests cover sparsity in neural networks and efficient ML. He has over 30 publications in top-tier machine learning conferences, such as IJCAI, ICLR, ICML, NeurIPS, IJCV, UAI, and LoG. Shiwei won the best paper award at the LoG’22 conference and the Cum Laude (distinguished Ph.D. thesis) at the Eindhoven University of Technology. He has served as an area chair in ICIP‘22 and ICIP’23; and a PC member of almost all top-tier ML/CV conferences. Shiwei has co-organized two tutorials in IJCAI and ECML-PKDD, which were widely acclaimed by the audience. He has also given more than 20 invited talks at many universities, companies, research labs, and conferences.

Lu Yin (Eindhoven University of Technology)
Decebal Mocanu (University of Twente)
Mykola Pechenizkiy (TU Eindhoven)

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors