Substring-searchable symmetric encryption (substring-SSE) is increasingly important for privacy-preserving applications in cloud systems. However, existing schemes remain vulnerable to information leakage during search operations, particularly when an adversary has partial knowledge of the target dataset. Although leakage-abuse attacks have been widely studied for traditional SSE, their applicability to substring-SSE in the partially known dataset setting remains unexplored. In this paper, we present the first leakage-abuse attack on substring-SSE in this setting. We develop a novel matrix-based correlation technique that extends and optimizes the LEAP framework for substring-SSE, enabling efficient recovery of plaintext data from encrypted suffix tree structures. Unlike existing approaches that rely on independent auxiliary datasets, our method directly exploits known data fragments to establish high-confidence mappings between ciphertext tokens and plaintext substrings through iterative matrix transformations. Comprehensive experiments on real-world datasets demonstrate the effectiveness of the attack: the substring recovery rate reaches 98.32% with 50% prior knowledge of the dataset. Even with only 10% prior knowledge, the attack achieves 74.42% substring recovery and scales well across datasets of varying sizes. These results reveal significant privacy risks in current substring-SSE designs and highlight the urgent need for leakage-resilient constructions.