Zero-Sum SVD: Balancing Loss Sensitivity for Low-Rank LLM Compression
Abstract
Advances in large language models (LLMs) have driven strong performance across many tasks, but their memory and compute costs still hinder deployment. SVD-based compression reduces storage and can speed up inference via low-rank factors, yet performance depends heavily on how rank is allocated under a global compression ratio. Prior methods often assign homogeneous ranks to similarly sized matrices despite large differences in loss sensitivity, or rely on expensive iterative pre-truncation optimization to determine per-matrix ranks. We propose Zero-Sum SVD (ZS-SVD), a post-training method that performs global singular-component selection using activation whitening and first-order estimates of the calibration loss in whitened coordinates. ZS-SVD prunes components across the whole model with a zero-sum rule that keeps the cumulative predicted loss change near zero, automatically yielding heterogeneous ranks without solving a rank-allocation optimization. Motivated by evidence that gradients near pretrained solutions exhibit low-rank structure, we also introduce an optional lightweight correction that applies a single projected gradient update after truncation, followed by re-truncation. Extensive experiments across multiple LLM architectures show that ZS-SVD yields consistent gains over prior SVD-based compression methods across diverse benchmarks and compression ratios.
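To make the zero-sum selection rule concrete, the following is a minimal sketch of one way such a global pruning step could be realized. It assumes that every candidate singular component across the model already carries a first-order predicted change in calibration loss and a parameter cost; the function name `zero_sum_select`, the greedy ordering, and the tolerance `tol` are illustrative assumptions, not the paper's exact procedure.

```python
# Minimal sketch (not the paper's exact procedure) of a zero-sum global
# component-selection rule. Hypothetical interface: each candidate singular
# component i across the whole model carries
#   delta[i] : first-order predicted change in calibration loss if i is pruned
#   cost[i]  : number of parameters freed by pruning component i
# Components are visited from most loss-reducing to most loss-increasing, and a
# component is pruned only while the running sum of predicted loss changes
# stays near zero, until the global parameter budget is met.
import numpy as np

def zero_sum_select(delta, cost, params_to_remove, tol=1e-3):
    """Return indices of pruned components and the cumulative predicted loss change."""
    order = np.argsort(delta)                  # most negative predicted change first
    pruned, cum_delta, removed = [], 0.0, 0.0
    for i in order:
        if removed >= params_to_remove:        # global compression budget reached
            break
        if cum_delta + delta[i] <= tol:        # keep cumulative predicted change near zero
            pruned.append(int(i))
            cum_delta += float(delta[i])
            removed += float(cost[i])
    return pruned, cum_delta

# Toy usage with synthetic per-component estimates.
rng = np.random.default_rng(0)
delta = rng.normal(0.0, 1e-3, size=1000)       # per-component loss-change estimates
cost = rng.integers(2_000, 10_000, size=1000)  # parameters freed per component
pruned, cum_delta = zero_sum_select(delta, cost, params_to_remove=1_000_000)
print(f"pruned {len(pruned)} components, cumulative predicted loss change {cum_delta:.4g}")
```

In a sketch of this kind, heterogeneous ranks emerge implicitly: a given weight matrix's final rank is simply its original rank minus the number of its components removed by the global rule, so no separate rank-allocation problem is solved.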