Large language models (LLMs) have demonstrated remarkable capabilities in code generation tasks. However, repository-level code generation presents unique challenges, particularly the need to use information spread across multiple files within a repository. Specifically, successful generation depends on a solid grasp of both general, context-agnostic knowledge and specific, context-dependent knowledge. While LLMs handle the context-agnostic aspect well, existing retrieval-based approaches often fall short because they are limited in how much of the broader and deeper repository context they can obtain. In this paper, we present CatCoder, a novel code generation framework designed for statically typed programming languages. CatCoder enhances repository-level code generation by integrating relevant code and type context. Specifically, it leverages static analyzers to extract type dependencies and merges this information with retrieved code to create comprehensive prompts for LLMs. To evaluate the effectiveness of CatCoder, we adapt and construct benchmarks comprising 199 Java tasks and 90 Rust tasks. The results show that CatCoder outperforms the RepoCoder baseline by up to 14.44% in compile@k and 17.35% in pass@k. In addition, we assess the generalizability of CatCoder using various LLMs, including both code-specialized and general-purpose models. Our findings indicate consistent performance improvements across all models, which underlines the practicality of CatCoder. Furthermore, we evaluate the time consumption of CatCoder on a large open-source repository, and the results demonstrate its scalability.
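To make the abstract's pipeline concrete, the prompt construction it describes (extract type dependencies for the target function, merge them with retrieved code, and form one prompt) can be pictured roughly as below. This is a minimal sketch, not CatCoder's actual implementation: the in-memory type index, the regex-based dependency extraction (a stand-in for a real static analyzer), and all identifiers are illustrative assumptions.

```python
import re

# Toy "repository type index": type name -> declaration. A real static
# analyzer would resolve these from the repository's source files.
TYPE_INDEX = {
    "Order": "public class Order { private List<Item> items; ... }",
    "Item": "public class Item { private String sku; private int qty; ... }",
}

def extract_type_deps(signature: str) -> list[str]:
    """Collect capitalized identifiers from the target signature as
    candidate type dependencies (crude stand-in for static analysis)."""
    candidates = re.findall(r"\b[A-Z]\w*\b", signature)
    deps, seen = [], set()
    for name in candidates:
        if name in TYPE_INDEX and name not in seen:
            seen.add(name)
            deps.append(name)
    return deps

def build_prompt(signature: str, retrieved_snippets: list[str]) -> str:
    """Merge type context with retrieved code into a single LLM prompt."""
    type_context = "\n".join(TYPE_INDEX[t] for t in extract_type_deps(signature))
    retrieved = "\n".join(retrieved_snippets)
    return (
        "// Relevant type definitions:\n" + type_context + "\n\n"
        "// Similar code retrieved from the repository:\n" + retrieved + "\n\n"
        "// Complete the following function:\n" + signature
    )

prompt = build_prompt(
    "public double totalPrice(Order order)",
    ["double sum = 0; for (Item i : order.getItems()) { ... }"],
)
print(prompt)
```

The key design point the sketch mirrors is that type context is pulled by dependency analysis rather than by textual similarity, so definitions the target code must compile against (here, `Order`) reach the prompt even when no retrieved snippet mentions them.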