Poster

Contradiction Retrieval via Contrastive Learning with Sparsity

Haike Xu · Zongyu Lin · Kai-Wei Chang · Yizhou Sun · Piotr Indyk

2025 Poster


Abstract

Contradiction retrieval refers to identifying and extracting documents that explicitly disagree with or refute the content of a query, which is important to many downstream applications such as fact checking and data cleaning. When retrieving arguments that contradict a query from large document corpora, existing methods such as similarity search and cross-encoder models each have limitations. To address these challenges, we introduce a novel approach, SparseCL, which leverages specially trained sentence embeddings designed to preserve subtle, contradictory nuances between sentences. Our method uses a combined metric of cosine similarity and a sparsity function to efficiently identify and retrieve documents that contradict a given query. This approach dramatically speeds up contradiction detection by replacing exhaustive document comparisons with simple vector calculations. We conduct contradiction retrieval experiments on Arguana, MSMARCO, and HotpotQA, where our method produces an average improvement of $11.0\%$ across different models. We also validate our method on downstream tasks such as natural language inference and cleaning corrupted corpora. This paper outlines a promising direction for non-similarity-based information retrieval, which is currently underexplored.
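
The combined metric described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the sparsity function, the weighting, or how embeddings are produced, so everything below is an assumption for illustration: the Hoyer measure stands in for the sparsity function, `alpha` is a hypothetical weight, the same vectors are used for both terms, and the function names are made up rather than taken from the authors' code.

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Standard cosine similarity between two non-zero vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def hoyer_sparsity(v: np.ndarray) -> float:
    """Hoyer sparsity in [0, 1]: 1 for a one-hot vector, 0 for a uniform one.

    Assumption: used here as a stand-in for the paper's sparsity function.
    """
    n = v.size
    l2 = np.linalg.norm(v)
    if n < 2 or l2 == 0.0:
        return 0.0
    l1 = np.abs(v).sum()
    return float((np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1.0))


def contradiction_score(query: np.ndarray, doc: np.ndarray, alpha: float = 1.0) -> float:
    """Hypothetical combined score: similarity plus weighted sparsity of the difference."""
    return cosine_similarity(query, doc) + alpha * hoyer_sparsity(query - doc)


def retrieve(query: np.ndarray, docs: list, k: int = 1, alpha: float = 1.0) -> list:
    """Return indices of the k highest-scoring documents."""
    scores = [contradiction_score(query, d, alpha) for d in docs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:k]


if __name__ == "__main__":
    query = np.array([1.0, 1.0, 1.0, 1.0])
    docs = [
        np.array([1.0, 1.0, 1.0, 0.0]),      # close to the query, differs in one coordinate
        np.array([0.5, 0.5, 0.5, 0.5]),      # same direction, dense difference
        np.array([-1.0, -1.0, -1.0, -1.0]),  # opposite direction, dense difference
    ]
    print(retrieve(query, docs, k=3))
```

Under these assumptions, a document that stays close to the query but differs in only a few coordinates ranks above both a near-duplicate and an unrelated document. Each score needs only vector arithmetic on precomputed embeddings, which is the efficiency argument the abstract makes against exhaustive pairwise comparison.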

Lay Summary

Contradiction retrieval refers to identifying and extracting documents that explicitly disagree with or refute the content of a query, which is crucial for downstream applications such as fact-checking and data cleaning. To tackle this task, we introduce SparseCL, a novel approach that uses specially trained sentence embeddings designed to capture subtle, contradictory nuances between sentences. Our method combines cosine similarity with a sparsity function to efficiently identify and retrieve documents that contradict a given query. This approach dramatically speeds up contradiction detection by replacing exhaustive document comparisons with simple vector calculations. This paper outlines a promising direction for non-similarity-based information retrieval, which is currently underexplored.
