Watermarking is a critical safeguard for text generated by Large Language Models (LLMs). By embedding identifiable signals into model outputs, watermarking enables reliable attribution and enhances the security of machine-generated content. Existing approaches typically embed signals by manipulating token generation probabilities. Despite their effectiveness, these methods inherently face a trade-off between detectability and text quality: the signal strength and randomness required for robust watermarking tend to degrade performance on downstream tasks. In this paper, we design a novel embedding scheme that controls seed pools to facilitate diverse parallel generation of watermarked text. Based on that scheme, we propose WaterSearch, a sentence-level, search-based watermarking framework adaptable to a wide range of existing methods. WaterSearch enhances text quality by jointly optimizing two key aspects: 1) distribution fidelity and 2) watermark signal characteristics. Furthermore, WaterSearch is complemented by a sentence-level detection method with strong attack robustness. We evaluate our method on three popular LLMs across ten diverse tasks. Extensive experiments demonstrate that our method achieves an average performance improvement of 51.01\% over state-of-the-art baselines at a watermark detectability strength of 95\%. In challenging scenarios such as short text generation and low-entropy output generation, our method yields performance gains of 47.78\% and 36.47\%, respectively. Moreover, under different attack scenarios, including insertion, synonym substitution, and paraphrase attacks, WaterSearch maintains high detectability, further validating its robustness to attacks. Our code is available at \href{https://github.com/Yukang-Lin/WaterSearch}{https://github.com/Yukang-Lin/WaterSearch}.
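To make the idea of seed-pool search concrete, the following toy sketch may help. It is not the WaterSearch algorithm: it assumes a green-list style logit bias as the underlying watermark, a fixed pseudo-random distribution standing in for the LLM, and a hypothetical score that mixes the average log-probability under the unwatermarked distribution (a proxy for distribution fidelity) with a green-token z-score (a proxy for signal strength). All names, including `search_generate`, `detect`, `alpha`, and `delta`, are illustrative assumptions.

```python
import hashlib
import math
import random

VOCAB = [f"w{i}" for i in range(50)]  # toy vocabulary (assumption)
GAMMA = 0.5                           # fraction of tokens marked "green"

def green_set(seed: int, prev_token: str) -> set:
    """Pseudo-randomly choose the green tokens for a seed and context."""
    digest = hashlib.sha256(f"{seed}:{prev_token}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(VOCAB, int(GAMMA * len(VOCAB))))

def base_probs(prev_token: str) -> dict:
    """Stand-in for an LLM: a fixed next-token distribution per context."""
    rng = random.Random(prev_token)
    weights = [rng.random() for _ in VOCAB]
    total = sum(weights)
    return {t: w / total for t, w in zip(VOCAB, weights)}

def generate(seed: int, length: int, delta: float, rng: random.Random):
    """Sample one candidate with green tokens boosted by exp(delta)."""
    tokens, prev, logp = [], "<s>", 0.0
    for _ in range(length):
        probs = base_probs(prev)
        greens = green_set(seed, prev)
        boost = math.exp(delta)
        weights = [probs[t] * (boost if t in greens else 1.0) for t in VOCAB]
        tok = rng.choices(VOCAB, weights=weights, k=1)[0]
        logp += math.log(probs[tok])  # log-prob under the unwatermarked model
        tokens.append(tok)
        prev = tok
    return tokens, logp

def green_z(tokens, seed: int) -> float:
    """z-score of the green-token count against the no-watermark null."""
    prev, hits = "<s>", 0
    for tok in tokens:
        hits += tok in green_set(seed, prev)
        prev = tok
    n = len(tokens)
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

def search_generate(seed_pool, length=60, delta=3.0, alpha=0.5, rng_seed=0):
    """Generate one candidate per seed and keep the best-scoring one."""
    rng = random.Random(rng_seed)
    best = None
    for seed in seed_pool:
        tokens, logp = generate(seed, length, delta, rng)
        # Hypothetical score: fidelity proxy plus signal-strength proxy.
        score = alpha * logp / length + (1 - alpha) * green_z(tokens, seed)
        if best is None or score > best[0]:
            best = (score, seed, tokens)
    return best[1], best[2]

def detect(tokens, seed_pool, threshold=4.0) -> bool:
    """Flag text as watermarked if any seed in the pool gives a high z-score."""
    return max(green_z(tokens, s) for s in seed_pool) >= threshold

pool = [11, 22, 33]
chosen_seed, sentence = search_generate(pool)
print(chosen_seed, detect(sentence, pool))
```

In this sketch the detector does not need to know which seed was selected; it simply tests every seed in the pool, which is one plausible way a seed-pool scheme could remain detectable. A real system would replace `base_probs` with model logits and would likely use a more careful fidelity measure than average log-probability.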