Poster Mon, Jul 6, 2026 • 6:30 PM – 8:15 PM PDT HALL A #1900

Power-Calibrated LLM Watermarking: A Statistical Framework

Xiaopu Wang ⋅ Zelin He ⋅ Chengyuan Liu ⋅ Runze Li

Abstract

Logit-based watermarking is a widely used mechanism for identifying LLM-generated content, yet its effectiveness is governed by a fundamental trade-off between detectability and semantic distortion. Existing analyses provide limited guidance for principled hyperparameter selection, leaving practical deployments reliant on heuristic tuning. In this work, we develop a power-calibrated statistical framework that establishes explicit quantitative relationships between watermark hyperparameters, detection power, and distortion. This characterization transforms watermark design into a guided optimization problem. Building on these results, we derive practical parameter selection procedures that achieve optimal trade-offs under constraints. Extensive experiments across multiple language models and datasets validate the theory and demonstrate that the proposed framework consistently identifies Pareto-optimal points.
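
As a rough illustration of what a link between hyperparameters and detection power can look like, the sketch below uses the familiar green-list z-test with a normal approximation. It is not the framework proposed in this work, whose formulas the abstract does not state. The names `gamma` (green-list fraction), `p_green` (assumed per-token probability of a green token in watermarked text, which in practice depends on the logit bias and the model's output distribution), `detection_power`, and `tokens_for_power` are all hypothetical and chosen for this example only.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and std 1


def detection_power(num_tokens, gamma, p_green, alpha=0.01):
    """Approximate power of a one-sided z-test on the green-token count.

    Assumption: tokens are independent, each green with probability
    `gamma` in unwatermarked text and `p_green` in watermarked text.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha)
    # Rejection threshold on the green count under the null hypothesis.
    threshold = gamma * num_tokens + z_alpha * sqrt(num_tokens * gamma * (1 - gamma))
    # Normal approximation to the green count under the alternative.
    mean_h1 = num_tokens * p_green
    std_h1 = sqrt(num_tokens * p_green * (1 - p_green))
    return 1.0 - _STD_NORMAL.cdf((threshold - mean_h1) / std_h1)


def tokens_for_power(gamma, p_green, alpha=0.01, target_power=0.9):
    """Smallest token count reaching `target_power` under the same approximation."""
    if p_green <= gamma:
        raise ValueError("p_green must exceed gamma for the test to have power")
    z_alpha = _STD_NORMAL.inv_cdf(1.0 - alpha)
    z_beta = _STD_NORMAL.inv_cdf(target_power)
    root = (
        z_alpha * sqrt(gamma * (1 - gamma)) + z_beta * sqrt(p_green * (1 - p_green))
    ) / (p_green - gamma)
    return ceil(root ** 2)


if __name__ == "__main__":
    # Illustrative values only, not taken from the paper.
    n = tokens_for_power(gamma=0.25, p_green=0.40, alpha=0.01, target_power=0.9)
    print(n, detection_power(n, gamma=0.25, p_green=0.40, alpha=0.01))
```

Under these assumptions, a stronger watermark (larger `p_green`) lowers the number of tokens needed for a given power, which is the detectability side of the trade-off described above. The distortion side is not modeled in this sketch.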