Language models (LMs) can generate code but cannot guarantee its correctness$\unicode{x2014}$often producing outputs that violate type safety, program invariants, or other semantic properties. Constrained decoding offers a solution by restricting generation to only produce programs that satisfy user-defined properties. However, existing methods are either limited to syntactic constraints or rely on brittle, ad hoc encodings of semantic properties over token sequences rather than program structure. We present ChopChop, the first programmable framework for constraining the output of LMs with respect to semantic properties. ChopChop introduces a principled way to construct constrained decoders based on analyzing the space of programs a prefix represents. It formulates this analysis as a realizability problem which is solved via coinduction, connecting token-level generation with structural reasoning over programs. We demonstrate ChopChop's generality by using it to enforce (1) equivalence to a reference program and (2) type safety. Across a range of models and tasks, ChopChop improves success rates while maintaining practical decoding latency.
翻译:语言模型(LMs)能够生成代码,但无法保证其正确性——常常产生违反类型安全、程序不变量或其他语义属性的输出。约束解码通过限制生成过程仅产生满足用户定义属性的程序,提供了一种解决方案。然而,现有方法要么仅限于句法约束,要么依赖于对标记序列而非程序结构的脆弱、临时性语义属性编码。我们提出了ChopChop,这是首个用于根据语义属性约束语言模型输出的可编程框架。ChopChop引入了一种基于分析前缀所代表程序空间来构建约束解码器的原则性方法。它将此分析形式化为一个可实现性问题,并通过共归纳法求解,从而将标记级生成与程序结构推理联系起来。我们通过使用ChopChop强制执行(1)与参考程序的等价性以及(2)类型安全性,展示了其通用性。在一系列模型和任务中,ChopChop在保持实际解码延迟的同时提高了成功率。