Early in the training of a neural network, there exist sparse subnetworks ("winning lottery tickets") that can be trained in isolation to match the accuracy of full, dense training (Frankle & Carbin, 2019; Frankle et al., 2020a). While this behavior was first observed for unstructured pruning, it is less clear whether such subnetworks also arise under structured pruning, which has the advantage of being more computationally efficient than unstructured pruning. In this work, we show that a simple method of kernel pruning by mean magnitude, which outperforms the better-studied method of filter pruning, can also identify structured winning tickets, much as filter pruning and unstructured pruning do. Moreover, we demonstrate that applying mean-magnitude kernel pruning to networks early in training achieves a higher accuracy-to-FLOPs ratio than dense training, filter pruning, or pruning at initialization.
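To make the pruning criterion concrete, the sketch below illustrates kernel pruning by mean magnitude in PyTorch: each 2D kernel of a convolutional layer is scored by the mean absolute value of its weights, and the lowest-scoring fraction is zeroed out. The function name, the layer-wise pruning fraction, and the masking mechanism are illustrative assumptions, not the authors' exact procedure.

```python
# Minimal sketch of kernel pruning by mean weight magnitude (assumption: PyTorch,
# layer-wise pruning fraction). Not the authors' exact implementation.
import torch
import torch.nn as nn


def kernel_prune_by_mean_magnitude(model: nn.Module, fraction: float = 0.5) -> None:
    """Zero out the `fraction` of 2D kernels with the smallest mean |weight| per conv layer."""
    for module in model.modules():
        if not isinstance(module, nn.Conv2d):
            continue
        w = module.weight.data                         # shape: (out_ch, in_ch, kH, kW)
        scores = w.abs().mean(dim=(2, 3))              # one score per (out_ch, in_ch) kernel
        k = int(fraction * scores.numel())             # number of kernels to remove
        if k == 0:
            continue
        threshold = scores.flatten().kthvalue(k).values
        mask = (scores > threshold).to(w.dtype)        # keep kernels above the threshold
        module.weight.data *= mask[:, :, None, None]   # broadcast mask over kernel dims


# Example: prune half of the kernels in a small conv stack.
net = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))
kernel_prune_by_mean_magnitude(net, fraction=0.5)
```

In a training pipeline, the resulting mask would typically be stored and re-applied after each optimizer step so that pruned kernels stay at zero during the remainder of training.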
Author Information
Udbhav Bamba (Indian Institute of Technology (ISM) Dhanbad)
Devin Kwok (Mila)
Gintare Karolina Dziugaite (Element AI, a ServiceNow Company)
David Rolnick (McGill University, Mila)
More from the Same Authors
- 2021 : Towards a Unified Information-Theoretic Framework for Generalization » Mahdi Haghifam · Gintare Karolina Dziugaite · Shay Moran
- 2021 : On the Generalization Improvement from Neural Network Pruning » Tian Jin · Gintare Karolina Dziugaite · Michael Carbin
- 2022 : Pre-Training on a Data Diet: Identifying Sufficient Examples for Early Training » Mansheej Paul · Brett Larsen · Surya Ganguli · Jonathan Frankle · Gintare Karolina Dziugaite
- 2023 Poster: Hidden symmetries of ReLU networks » Elisenda Grigsby · Kathryn Lindsey · David Rolnick
- 2023 Poster: FAENet: Frame Averaging Equivariant GNNs for Materials Modeling » ALEXANDRE DUVAL · Victor Schmidt · Alex Hernandez-Garcia · Fragkiskos Malliaros · Yoshua Bengio · Santiago Miret · David Rolnick
- 2023 Poster: Maximal Initial Learning Rates in Deep ReLU Networks » Gaurav Iyer · Boris Hanin · David Rolnick
- 2022 : Q&A » Priya Donti · David Rolnick · Lynn Kaack
- 2022 : Takeaways and how to get involved » David Rolnick
- 2022 : Q&A » Priya Donti · David Rolnick · Lynn Kaack
- 2022 : Q&A » Priya Donti · David Rolnick · Lynn Kaack
- 2022 : Research challenges: Generalization and causality » David Rolnick
- 2022 : Q&A » Priya Donti · David Rolnick · Lynn Kaack
- 2022 : Overview: Opportunities for machine learning in climate action » David Rolnick
- 2022 Tutorial: Climate Change and Machine Learning: Opportunities, Challenges, and Considerations » Priya Donti · David Rolnick · Lynn Kaack
- 2021 : On the Generalization Improvement from Neural Network Pruning » Tian Jin · Gintare Karolina Dziugaite · Michael Carbin
- 2021 Workshop: Tackling Climate Change with Machine Learning » Hari Prasanna Das · Katarzyna Tokarska · Maria João Sousa · Meareg Hailemariam · David Rolnick · Xiaoxiang Zhu · Yoshua Bengio
- 2020 Poster: Generalization via Derandomization » Jeffrey Negrea · Gintare Karolina Dziugaite · Daniel Roy
- 2020 Poster: Linear Mode Connectivity and the Lottery Ticket Hypothesis » Jonathan Frankle · Gintare Karolina Dziugaite · Daniel Roy · Michael Carbin
- 2018 Poster: Entropy-SGD optimizes the prior of a PAC-Bayes bound: Generalization properties of Entropy-SGD and data-dependent priors » Gintare Karolina Dziugaite · Daniel Roy
- 2018 Oral: Entropy-SGD optimizes the prior of a PAC-Bayes bound: Generalization properties of Entropy-SGD and data-dependent priors » Gintare Karolina Dziugaite · Daniel Roy