The rapid growth of Ethereum has made fast, accurate detection of smart contract vulnerabilities increasingly important. While machine-learning-based methods have shown some promise, many still rely on rule-based preprocessing designed by domain experts. Such preprocessing often discards crucial context from the source code, which can cause certain vulnerabilities to be overlooked and limits adaptability to newly emerging threats. We introduce BugSweeper, an end-to-end deep learning framework that detects vulnerabilities directly from the source code without manually engineered preprocessing. BugSweeper represents each Solidity function as a Function-Level Abstract Syntax Graph (FLAG), a novel graph that combines the function's Abstract Syntax Tree (AST) with enriched control-flow and data-flow semantics. A two-stage Graph Neural Network (GNN) then analyzes these graphs: the first-stage GNN filters noise from the syntax graphs, while the second-stage GNN conducts high-level reasoning to detect diverse vulnerabilities. Extensive experiments on real-world contracts show that BugSweeper significantly outperforms all state-of-the-art detection methods. By removing the need for handcrafted rules, our approach offers a robust, automated, and scalable solution for securing smart contracts without any dependence on security experts.
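To make the pipeline above concrete, the following is a minimal Python sketch of the two ideas in the abstract: a function-level graph that layers control-flow and data-flow edges over AST nodes, followed by a filtering stage and a reasoning stage. It is an illustration under stated assumptions, not BugSweeper's implementation. The `Flag` class, the hand-built `withdraw()` graph, the `semantic_degree` score, and the unweighted `message_pass` step are all hypothetical stand-ins; in the actual framework both stages are learned GNNs.

```python
from dataclasses import dataclass, field


@dataclass
class Flag:
    """Toy function-level graph: node labels plus typed, directed edges."""
    labels: list[str]
    edges: list[tuple[int, int, str]] = field(default_factory=list)


def build_toy_flag() -> Flag:
    """Hand-built graph for a hypothetical withdraw() body (not parser output)."""
    g = Flag(["FunctionDefinition", "Block", "ExternalCall",
              "Assignment", "Identifier:balance", "Literal:0"])
    # Syntax: AST parent -> child edges.
    for parent, child in [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5)]:
        g.edges.append((parent, child, "ast"))
    # Semantics layered on the same nodes.
    g.edges.append((2, 3, "control_flow"))  # the call runs before the assignment
    g.edges.append((5, 4, "data_flow"))     # the literal flows into balance
    return g


def semantic_degree(g: Flag) -> list[int]:
    """Placeholder for a learned stage-one score: count non-AST edges per node."""
    score = [0] * len(g.labels)
    for src, dst, kind in g.edges:
        if kind != "ast":
            score[src] += 1
            score[dst] += 1
    return score


def filter_nodes(g: Flag, keep: set[int]) -> Flag:
    """Stage one: keep the chosen nodes and only the edges between them."""
    order = sorted(keep)
    index = {old: new for new, old in enumerate(order)}
    edges = [(index[s], index[d], k) for s, d, k in g.edges
             if s in index and d in index]
    return Flag([g.labels[i] for i in order], edges)


def message_pass(g: Flag) -> list[float]:
    """Stage two: one unweighted round; each node adds its in-neighbors' features."""
    h = [1.0] * len(g.labels)
    out = list(h)
    for src, dst, _ in g.edges:
        out[dst] += h[src]
    return out


flag = build_toy_flag()
scores = semantic_degree(flag)
pruned = filter_nodes(flag, {i for i, s in enumerate(scores) if s > 0})
graph_score = sum(message_pass(pruned))  # sum readout over the pruned graph
```

Here the first stage drops the two purely structural nodes (`FunctionDefinition` and `Block`) and the second stage aggregates over what remains. A trained model would replace the fixed score and the unweighted sum with learned parameters and would feed the readout into a vulnerability classifier.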