IVQ: Structured and Lightweight Vector Quantization via Binary Hierarchical Composition Inspired by $\textit{IChing}$
Abstract
Vector Quantization (VQ) is widely used in visual and audio representation because it compresses high-dimensional signals effectively. However, existing VQ methods often rely on large, unstructured codebooks, which leads to inefficient code utilization and frequent codebook collapse. In this paper, we propose IChing Vector Quantization (IVQ), a lightweight and structured vector quantization framework inspired by the IChing. IVQ introduces binary hierarchical composition and geometric symmetry relations into the codebook design, so that a compact set of structured codes can represent a large number of configurations while maintaining high utilization and avoiding codebook collapse. We systematically compare IVQ with several VQ variants, focusing mainly on audio representation. Experimental results show that IVQ achieves higher quality than these variants with significantly smaller codebooks and consistently higher utilization rates. Auxiliary experiments on visual reconstruction and cross-modal alignment further support the generality and robustness of our structured representation.
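To make the idea of binary composition concrete, the following is a minimal sketch and not the IVQ construction itself, which the abstract does not specify. It assumes six binary positions (by analogy with the six lines of an IChing hexagram), one hypothetical base vector per position and state, and composition by summation. The names `build_compositional_codebook`, `quantize`, and `utilization` are illustrative only. Under these assumptions, 12 base vectors yield 2^6 = 64 composed codes, and utilization is measured as the fraction of codes selected at least once.

```python
import itertools
import random


def build_compositional_codebook(num_lines=6, dim=4, seed=0):
    """Compose 2**num_lines codes from 2*num_lines base vectors (assumed scheme)."""
    rng = random.Random(seed)
    # base[i][b]: hypothetical vector for binary position i in state b (0 or 1).
    base = [[[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(2)]
            for _ in range(num_lines)]
    codebook = {}
    for bits in itertools.product((0, 1), repeat=num_lines):
        code = [0.0] * dim
        for i, b in enumerate(bits):
            # Assumption: a composed code is the sum of its per-position vectors.
            code = [c + v for c, v in zip(code, base[i][b])]
        codebook[bits] = code
    return base, codebook


def quantize(x, codebook):
    """Return the bit pattern of the nearest code under squared Euclidean distance."""
    return min(codebook,
               key=lambda bits: sum((a - c) ** 2 for a, c in zip(x, codebook[bits])))


def utilization(indices, codebook_size):
    """Fraction of codebook entries that were selected at least once."""
    return len(set(indices)) / codebook_size


if __name__ == "__main__":
    base, codebook = build_compositional_codebook()
    rng = random.Random(1)
    inputs = [[rng.gauss(0.0, 2.0) for _ in range(4)] for _ in range(500)]
    used = [quantize(x, codebook) for x in inputs]
    print("base vectors:", 2 * len(base), "composed codes:", len(codebook))
    print("utilization:", utilization(used, len(codebook)))
```

The point of the sketch is the parameter count: the number of stored vectors grows linearly with the number of binary positions, while the number of representable configurations grows exponentially.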