LSGQuant: Layer-Sensitivity Guided Quantization for One-Step Diffusion Real-World Video Super-Resolution
Abstract
One-step diffusion models have demonstrated promising quality and fast inference in real-world video super-resolution (VSR). However, the substantial model size and high computational cost of Diffusion Transformers (DiTs) hinder their practical deployment. While low-bit quantization is a common approach to model compression, the effectiveness of quantized models is challenged by the high dynamic range of input latents and the diverse behaviors of individual layers. To address these limitations, we introduce LSGQuant, a layer-sensitivity guided quantization framework for one-step diffusion-based real-world VSR. Our method incorporates a Dynamic Range Adaptive Quantizer (DRAQ) to fit the distribution of video token activations. Furthermore, we estimate layer sensitivity from layer-wise statistics collected during calibration and, guided by these estimates, implement a Variance-Oriented Layer Training Strategy (VOLTS). We also introduce Quantization-Aware Optimization (QAO) to jointly refine the quantized branch and a retained high-precision branch. Extensive experiments demonstrate that our approach achieves performance comparable to the full-precision model and significantly surpasses existing quantization techniques. All models and code will be made publicly available.