Unified Scaling Laws for Compressed Representations
Abstract
Scaling laws have shaped recent advances in machine learning by predicting model performance as a function of model size, computation, and data. Concurrently, the rising computational cost of AI has made model compression techniques, notably quantization and sparsification, essential for large-scale training and inference. This paper investigates the interplay between scaling laws and compression formats, exploring whether a unified scaling framework can accurately predict model performance when training occurs over various compressed representations, such as sparse, scalar-quantized, or sparse-quantized formats. We validate a general scaling law formulation and show that it applies both individually and composably across compression types. Our main result is demonstrating that there exists a simple ``capacity'' metric, based on fitting random Gaussian data, which robustly predicts parameter efficiency across multiple representations.
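To make the capacity notion concrete, the sketch below estimates a Gaussian-fit error for a given compressed representation: sample i.i.d. standard normal data, pass it through a compress-then-decompress map, and measure the resulting mean squared error, with lower error read as higher capacity. This is a minimal illustration under our own assumptions; the helper names (gaussian_fit_mse, int_quantize, magnitude_sparsify) and the particular quantizer and sparsifier are hypothetical stand-ins, not the paper's exact formats or fitting protocol.

```python
import numpy as np

def gaussian_fit_mse(compress, n_samples=1_000_000, seed=0):
    """MSE incurred when a compressed representation fits i.i.d. N(0, 1) data.

    `compress` maps a float array to its compressed-then-decompressed values
    (e.g., quantization, sparsification, or a composition of both).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_samples)
    x_hat = compress(x)
    return float(np.mean((x - x_hat) ** 2))

# Illustrative compressors (assumed forms, not the paper's exact representations):
def int_quantize(bits):
    """Symmetric uniform quantizer with a simple absmax scale."""
    def q(x):
        scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1)
        return np.round(x / scale) * scale
    return q

def magnitude_sparsify(density):
    """Keep the largest-magnitude `density` fraction of entries, zero the rest."""
    def s(x):
        k = int(len(x) * density)
        thresh = np.partition(np.abs(x), -k)[-k]
        return np.where(np.abs(x) >= thresh, x, 0.0)
    return s

# Capacity proxies for scalar-quantized, sparse, and sparse-quantized representations.
print(gaussian_fit_mse(int_quantize(4)))
print(gaussian_fit_mse(magnitude_sparsify(0.5)))
print(gaussian_fit_mse(lambda x: int_quantize(4)(magnitude_sparsify(0.5)(x))))
```

The same routine composes naturally: chaining sparsification and quantization yields a capacity estimate for the combined format, mirroring the composability claim in the abstract.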