LORD-GoF: A Robust Online Detection Approach for LLM Watermarks in Sparse and Mixed Streams
Abstract
Watermarking is crucial for identifying AI-generated text. However, existing detection methods often focus on offline settings and fail to control the online False Discovery Rate (oFDR) when applied to real-world streams in which machine-generated content is sparse and mixed with human writing. To address this issue, we propose LORD-GoF, a novel online detection framework that combines a Goodness-of-Fit (GoF) statistic with the Levels based On Recent Discovery (LORD) procedure. We prove that LORD-GoF controls the oFDR below a user-specified level by dynamically adjusting its detection thresholds. Extensive experiments on watermarked text from Qwen-2.5-3B, Sheared-LLaMA-2.7B, and OPT-1.3B, using both the Gumbel-Max and Inverse Transform watermarking schemes, show that our method maintains statistical power comparable to offline benchmarks while controlling the oFDR in complex, mixed streaming scenarios.
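
To make the idea of dynamically adjusted thresholds concrete, the following is a minimal sketch of a LORD-style online testing loop. It is an illustration under stated assumptions, not the procedure as specified in this paper: it uses the LORD++ update from the online FDR literature, a weight sequence gamma_t = 6 / (pi^2 t^2), and an initial wealth of alpha / 2, all of which are choices made here for illustration. The p-values are taken as given; in LORD-GoF they would come from the GoF statistic computed on each incoming text segment. The names `gamma` and `lord_plus_plus` are hypothetical.

```python
import math


def gamma(t: int) -> float:
    """Nonnegative weights that sum to 1 over t = 1, 2, ... (assumed choice)."""
    return 6.0 / (math.pi ** 2 * t ** 2)


def lord_plus_plus(p_values, alpha=0.05, w0=None):
    """Run a LORD++-style online test over a stream of p-values.

    Returns (decisions, thresholds), where decisions[i] is True when the
    i-th item is flagged (null hypothesis of "not watermarked" rejected).
    """
    if w0 is None:
        w0 = alpha / 2  # assumed initial wealth; must satisfy 0 < w0 <= alpha
    rejection_times = []  # 1-based times of earlier rejections
    decisions, thresholds = [], []
    for t, p in enumerate(p_values, start=1):
        # Base level paid for by the initial wealth.
        level = gamma(t) * w0
        # Each earlier rejection earns back wealth that is spent over time;
        # the first rejection earns alpha - w0, later ones earn alpha.
        for k, tau in enumerate(rejection_times):
            payout = (alpha - w0) if k == 0 else alpha
            level += payout * gamma(t - tau)
        reject = p <= level
        thresholds.append(level)
        decisions.append(reject)
        if reject:
            rejection_times.append(t)
    return decisions, thresholds


# Toy stream: two very small p-values among clearly null ones.
decisions, thresholds = lord_plus_plus([1e-6, 0.9, 0.5, 1e-6], alpha=0.05)
```

In this sketch the threshold at each step depends only on past decisions, so it can be computed before the current p-value is observed, and a discovery raises the thresholds that immediately follow it.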