Deep Discriminative Structure Proxy Hashing for Cross-modal Retrieval
Abstract
Existing proxy-based hashing methods optimize each sample toward its proxies using isolated, per-proxy similarity constraints. Although efficient, this design overlooks the fact that the proxies, while learned jointly, have no explicit relational or competitive interactions during optimization. Consequently, proxy responses to a sample are often accumulated rather than contrasted, leading to weakly defined decision regions and limited discriminative structure in the Hamming space. To address this, we propose Deep Discriminative Structure Proxy Hashing, which organizes multiple proxies into sample-specific relational structures so that proxies interact and compete when responding to each sample. Through structure-guided learning, these interactions explicitly contrast positive and negative proxy responses, shaping clearer and more discriminative decision boundaries. Extensive experiments on standard cross-modal benchmarks demonstrate that this structured discrimination consistently improves retrieval accuracy and embedding separability.
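
To make the distinction between accumulated and contrasted proxy responses concrete, the following is a minimal illustrative sketch and not the objective used in this paper. The function names (`isolated_loss`, `contrastive_loss`), the cosine similarity, the per-proxy penalty terms, and the `temperature` parameter are all assumptions introduced only for illustration. In the first loss each proxy contributes an independent term; in the second, all proxies share one softmax normalizer, so a negative proxy's response directly reduces the score assigned to the positives.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def isolated_loss(x, proxies, labels):
    """Accumulated responses (hypothetical form): one independent term per proxy."""
    total = 0.0
    for proxy, is_positive in zip(proxies, labels):
        s = cosine(x, proxy)
        # Positive proxies are pulled toward similarity 1,
        # negative proxies are penalized only for positive similarity.
        total += (1.0 - s) if is_positive else max(0.0, s)
    return total


def contrastive_loss(x, proxies, labels, temperature=0.1):
    """Contrasted responses (hypothetical form): proxies compete in one softmax."""
    if not any(labels):
        raise ValueError("at least one positive proxy is required")
    logits = [cosine(x, proxy) / temperature for proxy in proxies]
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(logit - shift) for logit in logits]
    positive_mass = sum(e for e, is_positive in zip(exps, labels) if is_positive)
    # Negative log of the share of softmax mass held by the positive proxies.
    return -math.log(positive_mass / sum(exps))


# Toy example: one sample, one positive proxy, one negative proxy.
x = [1.0, 0.0]
labels = [True, False]
easy_negative = [[1.0, 0.0], [0.0, 1.0]]  # negative proxy orthogonal to x
hard_negative = [[1.0, 0.0], [0.9, 0.1]]  # negative proxy close to x

for name, proxies in (("easy", easy_negative), ("hard", hard_negative)):
    print(name, isolated_loss(x, proxies, labels), contrastive_loss(x, proxies, labels))
```

In this toy setting the sample coincides with its positive proxy, so the positive term of the isolated loss is zero in both cases and only the negative term changes. The contrastive form instead ties the two responses together through a shared normalizer, which is one simple way that competition among proxies can be expressed.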