Timezone: »

A Three-regime Model of Network Pruning
Yefan Zhou · Yaoqing Yang · Arin Chang · Michael Mahoney

Tue Jul 25 05:00 PM -- 06:30 PM (PDT) @ Exhibit Hall 1 #538

Recent work has highlighted the complex influence training hyperparameters, e.g., the number of training epochs, can have on the prunability of machine learning models. Perhaps surprisingly, a systematic approach to predict precisely how adjusting a specific hyperparameter will affect prunability remains elusive. To address this gap, we introduce a phenomenological model grounded in the statistical mechanics of learning. Our approach uses temperature-like and load-like parameters to model the impact of neural network (NN) training hyperparameters on pruning performance. A key empirical result we identify is a sharp transition phenomenon: depending on the value of a load-like parameter in the pruned model, increasing the value of a temperature-like parameter in the pre-pruned model may either enhance or impair subsequent pruning performance. Based on this transition, we build a three-regime model by taxonomizing the global structure of the pruned NN loss landscape. Our model reveals that the dichotomous effect of high temperature is associated with transitions between distinct types of global structures in the post-pruned model. Based on our results, we present three case-studies: 1) determining whether to increase or decrease a hyperparameter for improved pruning; 2) selecting the best model to prune from a family of models; and 3) tuning the hyperparameter of the Sharpness Aware Minimization method for better pruning performance.

Author Information

Yefan Zhou (UC Berkeley & International Computer Science Institute)
Yefan Zhou

Hi, I'm Yefan. I am an ML researcher working with the Big Data Group at International Computer Science Institute (ICSI). I earned my Master's degree in EECS at UC Berkeley where I had the pleasure of being advised by Prof. Michael Mahoney. I will join Dartmouth College as a CS Ph.D. student in Fall 2023.

Yaoqing Yang (Dartmouth College)
Arin Chang (University of California, Berkeley)
Michael Mahoney (UC Berkeley)

More from the Same Authors