Distributed in-network programs are increasingly deployed in data centers for their performance benefits, but shifting application logic to switches also enlarges the failure domain. Ensuring their correctness before deployment is thus critical for reliability. While prior verification frameworks can efficiently detect bugs for programs running on a single switch, they overlook the common interactive behaviors in distributed settings, thereby missing related bugs that can cause state inconsistencies and system failures. This paper presents Procurator, a verification framework that efficiently captures interactive behaviors in distributed in-network programs. Procurator introduces a formal model combining the actor paradigm with Communicating Sequential Processes (CSP), translating pipeline execution into reactive, event-driven actors and unifying their interactions as message passing. To support flexible specification of distributed properties, it provides a unified intent language. Additionally, it incorporates a semantic-aware state pruner to reduce verification complexity, thus ensuring system scalability. Evaluation results show that Procurator efficiently uncovers 10 distinct bugs caused by interactive behaviors across five real-world in-network systems. It also reduces verification time by up to 913.2x and memory consumption by up to 1.9x compared to the state-of-the-art verifier.
翻译:分布式网内程序因其性能优势在数据中心中日益普及,但将应用逻辑迁移至交换机的同时也扩大了故障域。因此,在部署前确保其正确性对系统可靠性至关重要。现有验证框架虽能高效检测单交换机运行程序的缺陷,却普遍忽视了分布式场景下常见的交互行为,从而遗漏了可能导致状态不一致与系统故障的相关缺陷。本文提出Procurator验证框架,可高效捕获分布式网内程序中的交互行为。该框架结合参与者范式与通信顺序进程(CSP)构建形式化模型,将流水线执行转化为反应式事件驱动参与者,并通过消息传递统一其交互过程。为支持分布式属性的灵活规约,框架提供了统一的意图描述语言。此外,通过引入语义感知的状态剪枝机制降低验证复杂度,从而保障系统可扩展性。评估结果表明:Procurator在五个真实网内系统中高效发现了10类由交互行为引发的缺陷;相较于当前最优验证器,其验证时间最多降低913.2倍,内存消耗最多减少1.9倍。