

Poster

Deep Unsupervised Hashing via External Guidance

Qihong Song · Xiting Liu · Hongyuan Zhu · Joey Tianyi Zhou · Xi Peng · Peng Hu

East Exhibition Hall A-B #E-1707
[ Project Page ]
Tue 15 Jul 11 a.m. PDT — 1:30 p.m. PDT

Abstract:

Recently, deep unsupervised hashing has gained considerable attention in image retrieval due to its advantages in cost-free data labeling, computational efficiency, and storage savings. Although existing methods achieve promising performance by leveraging inherent visual structures within the data, they primarily focus on learning discriminative features from unlabeled images through limited internal knowledge, resulting in an intrinsic upper bound on their performance. To break through this intrinsic limitation, we propose a novel method, called Deep Unsupervised Hashing with External Guidance (DUH-EG), which incorporates external textual knowledge as semantic guidance to enhance discrete representation learning. Specifically, our DUH-EG: i) selects representative semantic nouns from an external textual database by minimizing their redundancy, then matches images with them to extract more discriminative external features; and ii) presents a novel bidirectional contrastive learning mechanism to maximize agreement between hash codes in internal and external spaces, thereby capturing discrimination from both external and intrinsic structures in Hamming space. Extensive experiments on four benchmark datasets demonstrate that our DUH-EG remarkably outperforms existing state-of-the-art hashing methods.
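The bidirectional contrastive mechanism described above can be sketched as a symmetric InfoNCE-style objective: each image's internal (image-derived) hash code should agree with its own external (text-guided) hash code more than with any other image's code, in both directions. The sketch below is an illustrative assumption, not the authors' actual implementation; all function names, the tanh relaxation, and the exact loss form are hypothetical.

```python
# Hedged sketch of a bidirectional contrastive objective between
# "internal" (image-derived) and "external" (text-guided) hash codes.
# The tanh relaxation, loss form, and all names are illustrative
# assumptions, not the paper's actual implementation.
import numpy as np

def binarize(features):
    """Relax real-valued features to [-1, 1] hash codes via tanh
    (a common train-time surrogate for the sign function)."""
    return np.tanh(features)

def _logsumexp(x, axis):
    """Numerically stable log-sum-exp along an axis."""
    m = x.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))

def bidirectional_contrastive_loss(h_int, h_ext, temperature=0.5):
    """Symmetric InfoNCE-style loss: matched internal/external code
    pairs (the diagonal) should score higher than mismatched pairs."""
    a = h_int / np.linalg.norm(h_int, axis=1, keepdims=True)
    b = h_ext / np.linalg.norm(h_ext, axis=1, keepdims=True)
    logits = a @ b.T / temperature            # (N, N) cosine similarities
    idx = np.arange(len(h_int))               # matching pairs on the diagonal
    # Internal -> external direction.
    log_probs = logits - _logsumexp(logits, axis=1)
    loss_ie = -log_probs[idx, idx].mean()
    # External -> internal direction (transpose the similarity matrix).
    log_probs_t = logits.T - _logsumexp(logits.T, axis=1)
    loss_ei = -log_probs_t[idx, idx].mean()
    return 0.5 * (loss_ie + loss_ei)
```

A quick sanity check of the design: when each internal code is paired with its own external code, the loss should be lower than when the pairing is shuffled, which is exactly the agreement the paper's mechanism aims to maximize in Hamming space.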

Lay Summary:

In the digital world, quickly finding the right image in a huge collection is a big challenge. One way to do this is with short binary codes (called "hash codes") that help computers search faster. Many recent methods create these codes by training models directly on images without human-provided labels, which saves both time and effort. However, their performance is often limited because they rely solely on the visual information within the images. To address this, we propose a novel method, called Deep Unsupervised Hashing with External Guidance (DUH-EG). Specifically, we use nouns from an external textual database to help the model better understand the content of images. By effectively comparing and integrating what the model "sees" in the images with what it "knows" from nouns, it can generate more accurate and useful codes. We test our DUH-EG method on four well-known image datasets, and it clearly does better than the best methods currently available.
