Provenance analysis based on system audit data has emerged as a fundamental approach for investigating Advanced Persistent Threat (APT) attacks. Due to the high concealment and long-term persistence of APT attacks, they are only represented as a minimal part of the critical path in the provenance graph. While existing techniques employ behavioral pattern matching and data flow feature matching to uncover latent associations in attack sequences through provenance graph path reasoning, their inability to establish effective attack context associations often leads to the conflation of benign system operations with real attack entities, that fail to accurately characterize real APT behaviors. We observe that while the causality of entities in the provenance graph exhibit substantial complexity, attackers often follow specific attack patterns-specifically, clear combinations of tactics and techniques to achieve their goals. Based on these insights, we propose TPPR, a novel framework that first extracts anomaly subgraphs through abnormal node detection, TTP-annotation and graph pruning, then performs attack path reasoning using mined TTP sequential pattern, and finally reconstructs attack scenarios through confidence-based path scoring and merging. Extensive evaluation on real enterprise logs (more than 100 million events) and DARPA TC dataset demonstrates TPPR's capability to achieve 99.9% graph simplification (700,000 to 20 edges) while preserving 91% of critical attack nodes, outperforming state-of-the-art solutions (SPARSE, DepImpact) by 63.1% and 67.9% in reconstruction precision while maintaining attack scenario integrity.
翻译:暂无翻译