Poster in Workshop: Differentiable Almost Everything: Differentiable Relaxations, Algorithms, Operators, and Simulators
Differentiable Soft Min-Max Loss to Restrict Weight Range for Model Quantization
Arnav Kundu · Chungkuk Yoo · Minsik Cho · Saurabh Adya
Keywords: [ soft min-max ] [ quantization ] [ weight clustering ]
The range of weights in a model disrupts effective lower-bit quantization. Penalizing the weight range improves quantization accuracy, but the range calculation (max - min) is not differentiable. In this work, we propose the Differentiable Soft Min-Max Loss (DSMM) to restrict weight ranges, yielding a quantization-friendly model with narrow weight ranges. We apply DSMM with a learnable parameter that adjusts the hardness of DSMM without requiring an additional hyper-parameter. DSMM improves lower-bit quantization accuracy with state-of-the-art post-training quantization (PTQ), quantization-aware training (QAT), and weight clustering across various domains and model sizes.
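A common way to smooth the non-differentiable range term max(w) - min(w) is a log-sum-exp approximation with a temperature (hardness) parameter. The sketch below illustrates such a penalty with a learnable hardness, in the spirit of the abstract; the class name SoftMinMaxLoss, the log-sum-exp form, and the parameterization are illustrative assumptions, not the paper's exact formulation.

```python
import torch


class SoftMinMaxLoss(torch.nn.Module):
    """Hypothetical sketch: a smooth, differentiable surrogate for
    range(w) = max(w) - min(w) via log-sum-exp, with a learnable
    hardness parameter (not necessarily the paper's formulation)."""

    def __init__(self, init_hardness: float = 1.0):
        super().__init__()
        # Learnable hardness; larger values make the approximation
        # closer to the true (hard) min and max.
        self.log_beta = torch.nn.Parameter(torch.tensor(init_hardness).log())

    def forward(self, w: torch.Tensor) -> torch.Tensor:
        beta = self.log_beta.exp()
        w = w.flatten()
        # Smooth max: (1/beta) * logsumexp(beta * w); the smooth min is
        # the negated smooth max of -w. Both are differentiable everywhere.
        soft_max = torch.logsumexp(beta * w, dim=0) / beta
        soft_min = -torch.logsumexp(-beta * w, dim=0) / beta
        return soft_max - soft_min  # penalizes the weight range


# Example usage (hypothetical): add the penalty to a task loss.
layer = torch.nn.Linear(128, 128)
range_penalty = SoftMinMaxLoss()
penalty = range_penalty(layer.weight)  # scalar; gradients flow to both
                                       # the weights and the hardness
```

Because the hardness is itself a trainable parameter, the sharpness of the min-max approximation can adapt during training rather than being fixed as a hand-tuned hyper-parameter, which matches the abstract's description.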