Poster Mon, Jul 6, 2026 • 10:00 PM – 11:45 PM PDT HALL A #3312

AliMark: Enhancing Robustness of Sentence-Level Watermarking Against Text Paraphrasing

Yuexin Li ⋅ Wenjie Qu ⋅ Linyu Wu ⋅ Yulin Chen ⋅ Yufei He ⋅ Tri Cao ⋅ Bryan Hooi ⋅ Jiaheng Zhang

Abstract

Existing sentence-level watermarking methods enhance robustness to paraphrasing by anchoring watermarks in sentence semantics. However, their prefix-based designs remain vulnerable to structural perturbations, such as sentence splitting and merging, which commonly arise under strong paraphrasers like DIPPER and GPT-3.5. To mitigate this issue, we propose AliMark, a framework that reformulates sentence-level watermarking as a bit sequence encoding and alignment problem between a potentially watermarked text and a secret bit sequence. Notably, our approach adopts a two-stage detection strategy: we generate multiple restructured text variants and adaptively align their extracted bit sequences with the secret bit sequence to minimize alignment cost. This multi-candidate alignment design naturally improves robustness to sentence merges and splits. Extensive experiments demonstrate that AliMark substantially outperforms state-of-the-art baselines under diverse paraphrasing attacks. Our code is available at https://github.com/imethanlee/AliMark.
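The two-stage detection idea can be illustrated with a minimal sketch. It is not the authors' implementation, and every name in it is hypothetical. It assumes one bit per sentence, takes the bit extractor as a caller-supplied function `extract_bit`, generates variants by merging a single adjacent sentence pair only (no re-splitting), and uses plain edit distance as a stand-in for the alignment cost, which the abstract does not specify.

```python
from typing import Callable, List, Sequence, Tuple


def alignment_cost(bits: Sequence[int], secret: Sequence[int]) -> int:
    """Edit distance between extracted bits and the secret bits.

    Assumption: unit cost for substitution, insertion, and deletion.
    """
    n, m = len(bits), len(secret)
    prev = list(range(m + 1))  # cost of aligning an empty prefix of `bits`
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            substitute = prev[j - 1] + (bits[i - 1] != secret[j - 1])
            curr[j] = min(substitute, prev[j] + 1, curr[j - 1] + 1)
        prev = curr
    return prev[m]


def merge_variants(sentences: List[str]) -> List[List[str]]:
    """Return the original segmentation plus each single adjacent-pair merge."""
    variants = [list(sentences)]
    for i in range(len(sentences) - 1):
        merged = sentences[i] + " " + sentences[i + 1]
        variants.append(sentences[:i] + [merged] + sentences[i + 2:])
    return variants


def detect(
    sentences: List[str],
    secret: Sequence[int],
    extract_bit: Callable[[str], int],  # hypothetical per-sentence bit extractor
) -> Tuple[int, List[str]]:
    """Pick the restructured variant whose bits align best with the secret."""
    scored = [
        (alignment_cost([extract_bit(s) for s in variant], secret), variant)
        for variant in merge_variants(sentences)
    ]
    return min(scored, key=lambda pair: pair[0])


if __name__ == "__main__":
    # Toy extractor (word-count parity), used only to make the sketch runnable.
    # A real extractor would depend on sentence semantics.
    def toy_bit(sentence: str) -> int:
        return len(sentence.split()) % 2

    cost, variant = detect(["a b", "c", "d e f"], [1, 1], toy_bit)
    print(cost, variant)
```

In this toy run, the unmerged text yields three bits and cannot match a two-bit secret without an edit, while one of the merged variants matches it exactly. That is the intuition behind scoring several restructured candidates instead of a single segmentation.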

Lay Summary

As AI-generated text becomes increasingly difficult to distinguish from human writing, watermarking AI outputs has become a crucial tool for copyright protection. Yet existing sentence-level watermarks can be easily erased by paraphrasers that simply split or merge sentences. To address this issue, we introduce AliMark, a watermarking framework that embeds a secret sequence of bits in the semantic meaning of individual sentences across the entire text. During detection, the system attempts to restore altered sentence structures by re-merging or re-splitting them, and then adaptively aligns the extracted bits with the secret bit sequence to minimize alignment cost. This design naturally improves robustness to sentence splitting and merging. Our experiments demonstrate that AliMark preserves its watermark signal even under strong paraphrasing attacks, offering a robust and reliable solution for tracing AI-generated content, protecting intellectual property, and maintaining academic integrity.