Poster
in
Workshop: Hardware-aware efficient training (HAET)
Finding Structured Winning Tickets with Early Pruning
Udbhav Bamba · Devin Kwok · Gintare Karolina Dziugaite · David Rolnick
Early in the training of a neural network, there exist sparse subnetworks (“winning lottery tickets”) that can be trained in isolation to match the accuracy of full, dense training (Frankle & Carbin, 2019; Frankle et al., 2020a). While this behavior was first observed for unstructured pruning, it is less clear if such subnetworks also appear in different structured pruning regimes, which have the advantage of being more computationally efficient than unstructured pruning. In this work, we show that a simple method of kernel pruning by mean magnitude, which outperforms the better-studied method of filter pruning, can also identify structured winning tickets, much like filter pruning or unstructured pruning. Moreover, we demonstrate that applying mean magnitude kernel pruning to networks early in training can achieve a higher accuracy-to-FLOPs ratio than training dense networks, filter pruning networks, or pruning networks at initialization.