Poster in Workshop: Differentiable Almost Everything: Differentiable Relaxations, Algorithms, Operators, and Simulators
Differentiable Soft Min-Max Loss to Restrict Weight Range for Model Quantization
Arnav Kundu · Chungkuk Yoo · Minsik Cho · Saurabh Adya
Keywords: [ soft min-max ] [ quantization ] [ weight clustering ]
The range of weights in a model disrupts effective lower-bit quantization. Penalizing the weight range improves quantization accuracy, but the range calculation (max - min) is not differentiable. In this work, we propose the Differentiable Soft Min-Max Loss (DSMM) to restrict weight ranges, yielding a quantization-friendly model with narrow weight ranges. We apply DSMM with a learnable parameter that adjusts the hardness of DSMM without requiring an additional hyper-parameter. DSMM improves lower-bit quantization accuracy with state-of-the-art post-training quantization (PTQ), quantization-aware training (QAT), and weight clustering across various domains and model sizes.
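A common way to smooth the non-differentiable range term max(w) - min(w) is a log-sum-exp approximation with a temperature (hardness) parameter. The sketch below illustrates such a penalty with a learnable hardness, in the spirit of the abstract; the class name SoftMinMaxLoss, the log-sum-exp form, and the parameterization are illustrative assumptions, not the paper's exact formulation.

```python
import torch


class SoftMinMaxLoss(torch.nn.Module):
    """Hypothetical sketch: a smooth, differentiable surrogate for
    range(w) = max(w) - min(w) via log-sum-exp, with a learnable
    hardness parameter (not necessarily the paper's formulation)."""

    def __init__(self, init_hardness: float = 1.0):
        super().__init__()
        # Learnable hardness; larger values make the approximation
        # closer to the true (hard) min and max.
        self.log_beta = torch.nn.Parameter(torch.tensor(init_hardness).log())

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        beta = self.log_beta.exp()
        w = w.flatten()
        # Smooth max: (1/beta) * logsumexp(beta * w); the smooth min is
        # the negated smooth max of -w. Both are differentiable everywhere.
        soft_max = torch.logsumexp(beta * w, dim=0) / beta
        soft_min = -torch.logsumexp(-beta * w, dim=0) / beta
        return soft_max - soft_min  # penalizes the weight range


# Example usage (hypothetical): add the penalty to a task loss.
layer = torch.nn.Linear(128, 128)
range_penalty = SoftMinMaxLoss()
penalty = range_penalty(layer.weight)  # scalar; gradients flow to both
                                       # the weights and the hardness
```

Because the hardness is itself a trainable parameter, the sharpness of the min-max approximation can adapt during training rather than being fixed as a hand-tuned hyper-parameter, which matches the abstract's description.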