Absorbing Quantization Error by Deformable Noise Scheduler for Diffusion Models
Mingrui Yang ⋅ Wei Huang ⋅ Hao Sheng ⋅ Donglin Yang ⋅ Jichang Yang ⋅ Xin Yu ⋅ Huining Yu ⋅ Yuzhong Jiao ⋅ Zhongrui Wang ⋅ Xiaojuan Qi
Abstract
Diffusion models deliver state-of-the-art image quality but are expensive to deploy. Post-training quantization (PTQ) can shrink models and speed up inference, yet residual quantization errors distort the diffusion distribution (the timestep-wise marginal over $\mathbf{x}_t$), degrading sample quality. We propose a distribution-preserving framework that absorbs quantization error into the generative process without changing the architecture or adding sampling steps. The Deformable Noise Scheduler (DNS) reinterprets quantization error as a principled timestep shift, mapping the distribution of quantized predictions $\mathbf{x}_t$ back onto the original diffusion distribution so that the target marginal is preserved. Unlike trajectory-preserving or noise-injection methods that are limited to stochastic samplers, our approach preserves the distribution under both stochastic and deterministic samplers and extends to flow matching with Gaussian conditional paths. It is plug-and-play and complements existing PTQ schemes. Empirically, our method consistently improves generation quality across diverse backbones and existing PTQ baselines. Notably, when further quantizing the FP16 LoRA branch of SVDQuant to enable fully integer inference, our approach effectively mitigates the performance drop, reducing FID from 27.16 to 26.22.
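To make the timestep-shift intuition concrete, here is a minimal sketch, assuming quantization error on the noisy latent behaves like additive Gaussian noise with variance $\sigma_q^2$ under a variance-preserving schedule; the perturbed sample then carries the signal-to-noise ratio of a noisier timestep, which can be looked up from the scheduler's cumulative alphas. The function name `find_shifted_timestep` and its arguments are illustrative assumptions, not the paper's API.

```python
# Illustrative sketch only (not the authors' implementation): model quantization
# error on the noisy latent x_t as extra Gaussian noise with variance sigma_q2,
# then find the timestep whose diffusion marginal has the matching noise level.
import numpy as np

def find_shifted_timestep(alphas_cumprod: np.ndarray, t: int, sigma_q2: float) -> int:
    """Return t' whose marginal SNR matches the quantization-perturbed sample.

    alphas_cumprod: cumulative products abar_t of a variance-preserving schedule.
    t: nominal timestep of the (quantized) prediction.
    sigma_q2: assumed variance of the quantization error, treated as independent noise.
    """
    eps = 1e-12  # guard against division by zero at the ends of the schedule
    snr = alphas_cumprod / np.maximum(1.0 - alphas_cumprod, eps)
    # Perturbed sample: sqrt(abar_t) * x_0 + noise of variance (1 - abar_t) + sigma_q2,
    # so its SNR drops from abar_t / (1 - abar_t) to the target below.
    target_snr = alphas_cumprod[t] / ((1.0 - alphas_cumprod[t]) + sigma_q2)
    # The shifted timestep is the one whose marginal SNR is closest to the target;
    # a scalar rescaling of the sample then aligns it with the marginal at t'.
    return int(np.argmin(np.abs(snr - target_snr)))
```

Under these assumptions, the returned index is at or after the nominal timestep, since the extra error can only lower the signal-to-noise ratio of the sample.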