Command-line interface (CLI) fuzzing tests programs by mutating both command-line options and input file contents, enabling the discovery of vulnerabilities that manifest only under specific option-input combinations. Prior CLI fuzzing work struggles to generate semantically rich option strings and input files, and therefore often fails to reach deeply embedded target functions. As a result, existing CLI fuzzing techniques frequently miss vulnerabilities located in such deep code. In this paper, we design PILOT, a novel Path-guided, Iterative LLM-Orchestrated Testing framework for fuzzing CLI applications. The key insight is to provide potential call paths to target functions as context to a large language model (LLM) so that it can generate more effective CLI option strings and input files. PILOT then repeats this process iteratively, supplying the functions reached so far as additional context so that the target functions are eventually reached. Our evaluation on real-world CLI applications demonstrates that PILOT achieves higher coverage than state-of-the-art fuzzing approaches and discovers 51 zero-day vulnerabilities. We responsibly disclosed all of the vulnerabilities to the respective developers; so far, 41 have been confirmed, of which 33 have been fixed and three have been assigned CVE identifiers.
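The path-guided, iterative loop described in the abstract can be illustrated with a minimal Python sketch. This is not PILOT's actual implementation: the abstract does not specify how call paths are extracted, how the LLM is prompted, or how reached functions are collected, so `build_prompt`, `path_guided_loop`, and the `generate` and `execute` callbacks are all hypothetical names introduced here for illustration. The sketch assumes that `generate` wraps an LLM call and that `execute` runs the target program and reports which functions were hit.

```python
from typing import Callable, List, Set, Tuple

# Hypothetical representation of a test case: (CLI option string, input file bytes).
TestCase = Tuple[str, bytes]


def build_prompt(call_path: List[str], reached: Set[str]) -> str:
    """Compose LLM context from a candidate call path and the functions reached so far."""
    remaining = [f for f in call_path if f not in reached]
    return (
        "Call path to target: " + " -> ".join(call_path) + "\n"
        "Already reached: " + (", ".join(sorted(reached)) or "none") + "\n"
        "Not yet reached: " + (", ".join(remaining) or "none") + "\n"
        "Propose CLI options and an input file that reach the next function."
    )


def path_guided_loop(
    call_path: List[str],
    generate: Callable[[str], TestCase],      # assumed wrapper around an LLM call
    execute: Callable[[TestCase], Set[str]],  # assumed to return the functions hit by a run
    max_rounds: int = 10,
) -> Tuple[Set[str], List[TestCase]]:
    """Repeat generate-and-run, feeding reached functions back as context each round."""
    reached: Set[str] = set()
    history: List[TestCase] = []
    target = call_path[-1]
    for _ in range(max_rounds):
        case = generate(build_prompt(call_path, reached))
        history.append(case)
        # Keep only functions on the path of interest as feedback for the next round.
        reached |= execute(case) & set(call_path)
        if target in reached:
            break
    return reached, history
```

In this sketch the loop stops as soon as the last function on the call path is reached or the round budget is exhausted. A real system would presumably also pass the generated test cases on to a fuzzer for further mutation, which is omitted here.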