Unified Scaling Laws for Compressed Representations
Abstract
Scaling laws have shaped recent advances in machine learning by predicting model performance as a function of model size, computation, and data. Concurrently, the rising computational cost of AI has made model compression techniques, notably quantization and sparsification, essential for large-scale training and inference. This paper investigates the interplay between scaling laws and compression formats, exploring whether a unified scaling framework can accurately predict model performance when training occurs over various compressed representations, such as sparse, scalar-quantized, or sparse-quantized formats. We validate a general scaling law formulation and show that it applies both individually and composably across compression types. Our main result is demonstrating that there exists a simple ``capacity'' metric, based on fitting random Gaussian data, which robustly predicts parameter efficiency across multiple representations.
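To make the capacity notion concrete, the sketch below estimates a Gaussian-fit error for a given compressed representation: sample i.i.d. standard normal data, pass it through a compress-then-decompress map, and measure the resulting mean squared error, with lower error read as higher capacity. This is a minimal illustration under our own assumptions; the helper names (gaussian_fit_mse, int_quantize, magnitude_sparsify) and the particular quantizer and sparsifier are hypothetical stand-ins, not the paper's exact formats or fitting protocol.

```python
import numpy as np

def gaussian_fit_mse(compress, n_samples=1_000_000, seed=0):
    """MSE incurred when a compressed representation fits i.i.d. N(0, 1) data.

    `compress` maps a float array to its compressed-then-decompressed values
    (e.g., quantization, sparsification, or a composition of both).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_samples)
    x_hat = compress(x)
    return float(np.mean((x - x_hat) ** 2))

# Illustrative compressors (assumed forms, not the paper's exact representations):
def int_quantize(bits):
    """Symmetric uniform quantizer with a simple absmax scale."""
    def q(x):
        scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1)
        return np.round(x / scale) * scale
    return q

def magnitude_sparsify(density):
    """Keep the largest-magnitude `density` fraction of entries, zero the rest."""
    def s(x):
        k = int(len(x) * density)
        thresh = np.partition(np.abs(x), -k)[-k]
        return np.where(np.abs(x) >= thresh, x, 0.0)
    return s

# Capacity proxies for scalar-quantized, sparse, and sparse-quantized representations.
print(gaussian_fit_mse(int_quantize(4)))
print(gaussian_fit_mse(magnitude_sparsify(0.5)))
print(gaussian_fit_mse(lambda x: int_quantize(4)(magnitude_sparsify(0.5)(x))))
```

The same routine composes naturally: chaining sparsification and quantization yields a capacity estimate for the combined format, mirroring the composability claim in the abstract.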