Robust Self-reflective Hashing for Cross-modal Retrieval with Noisy Labels
Abstract
Cross-modal Hashing (CMH) typically assumes perfectly accurate annotations, whereas noisy labels are unavoidable in practical scenarios. Existing CMH methods often overlook the uncertainty introduced by noise or semantic ambiguity, making models prone to overfitting noisy labels and yielding unreliable similarity judgments at inference. To address this issue, we propose a Robust Self-reflective Hashing (RSH) framework that prudently analyzes semantic discrepancies while accounting for uncertainty, thereby effectively mitigating interference from noisy labels. Specifically, we introduce a Double Feature Representation (DFR) method that employs a semantic feature and an uncertainty feature to capture, respectively, the semantics and the fuzziness of each sample. Building on this double feature, we propose a novel cross-modal similarity metric, the Self-reflective Similarity Metric (SSM), which judges sample similarity by integrating semantic discrepancy with fuzziness, enabling the model to adaptively down-weight semantic discrepancy according to the uncertainty level. The proposed method is plug-and-play and can be seamlessly integrated into diverse objective functions to enhance model robustness and reliability. Extensive experiments on benchmark datasets demonstrate that RSH outperforms existing methods.
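To make the idea of uncertainty-weighted similarity concrete, the following is a minimal illustrative sketch, not the paper's actual SSM formulation: it assumes each sample carries a semantic feature vector and a scalar uncertainty, and down-weights the semantic discrepancy of a pair as their combined uncertainty grows. The function name, the cosine-based discrepancy, and the `1/(1 + u_a + u_b)` weighting are all hypothetical choices made for illustration.

```python
import numpy as np

def self_reflective_similarity(sem_a, sem_b, unc_a, unc_b):
    """Illustrative sketch of an uncertainty-aware similarity.

    sem_a, sem_b: semantic feature vectors of the two samples.
    unc_a, unc_b: non-negative scalar uncertainties (fuzziness).
    """
    # Semantic discrepancy from cosine similarity (0 = identical direction).
    cos = np.dot(sem_a, sem_b) / (np.linalg.norm(sem_a) * np.linalg.norm(sem_b))
    discrepancy = 1.0 - cos
    # Higher combined uncertainty -> weaker influence of the discrepancy,
    # so ambiguous pairs are judged less harshly.
    weight = 1.0 / (1.0 + unc_a + unc_b)
    return 1.0 - weight * discrepancy
```

Under this toy weighting, two confidently labeled but dissimilar samples score near 0, while the same pair with high uncertainty is pulled back toward neutral, mimicking the abstract's description of adaptively weakening semantic discrepancy.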