Beyond Hamming: Query-Aware Decoding of Binary Cosine Sketches
DaeHun Nyang
Abstract
Cosine similarity estimation is a core primitive in coarse-to-fine retrieval pipelines, where early-stage candidate selection relies on approximate similarity estimates whose errors are amplified downstream. Widely used sign-based sketches arising from extreme quantization of random projections exhibit a structural variance peak near $\theta \approx 90^\circ$, the near-background region where candidate selection is most difficult. We propose QA-Cos, a query-aware decoder-side estimator that departs from the Hamming-agreement paradigm, treating sign bits as probabilistic observations rather than deterministic votes. Across simulations and BEIR benchmarks, QA-Cos reduces estimation error by up to $\sim$15--20\% in the near-orthogonal region and translates these gains into improved candidate selection in two-stage ANN pipelines, improving Hit@K by up to $\sim$30 percentage points at fixed budgets and reducing candidates by up to $\sim$45--50\% at fixed recall.
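For orientation, the following is a minimal sketch of the classical Hamming-agreement baseline that the abstract contrasts against. It is not QA-Cos, whose construction the abstract does not specify. It assumes Gaussian random projections, for which two sign bits disagree with probability $\theta/\pi$. The function names (`sign_sketch`, `hamming_cosine`, `delta_method_variance`) and the sizes `d` and `m` are hypothetical choices made for illustration. The first-order (delta-method) variance of the estimate is $\pi^2 \sin^2\theta \cdot p(1-p)/m$ with $p = \theta/\pi$, which is largest at $\theta = 90^\circ$.

```python
import numpy as np

def sign_sketch(x, proj):
    """Sign bits of the random projections of x (True where the projection is >= 0)."""
    return proj @ x >= 0

def hamming_cosine(bits_a, bits_b):
    """Hamming-agreement estimate: map the disagreement fraction to an angle, then to a cosine."""
    frac_disagree = np.mean(bits_a != bits_b)
    return np.cos(np.pi * frac_disagree)

def delta_method_variance(theta, m):
    """First-order variance of hamming_cosine at true angle theta (radians) with m bits."""
    p = theta / np.pi  # per-bit disagreement probability under Gaussian projections
    return (np.pi * np.sin(theta)) ** 2 * p * (1 - p) / m

# Illustrative sizes only (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
d, m = 64, 4096
proj = rng.standard_normal((m, d))  # one Gaussian projection direction per bit
x = rng.standard_normal(d)
y = rng.standard_normal(d)

true_cos = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
est_cos = hamming_cosine(sign_sketch(x, proj), sign_sketch(y, proj))
print(f"true cosine {true_cos:.4f}, Hamming estimate {est_cos:.4f}")

# The approximate variance is largest at 90 degrees and smaller on either side.
for deg in (45, 90, 135):
    print(deg, delta_method_variance(np.deg2rad(deg), m))
```

Independent random vectors in moderate dimension are close to orthogonal, so this example lands in the region where the baseline's variance is highest.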