Prior research has shown that end-to-end malware detectors are vulnerable to adversarial attacks. In response, the research community has proposed defenses based on randomized and (de)randomized smoothing. However, these techniques remain susceptible to attacks that insert large adversarial payloads. To address this limitation, we propose a novel defense mechanism that hardens end-to-end malware detectors by masking their input at the byte level. The mechanism generates multiple masked versions of the input file, classifies each version independently, and then applies a threshold-based voting mechanism to produce the final classification. Key to this defense is a deterministic masking strategy that systematically strides a mask across the entire input file. Unlike randomized smoothing defenses, which mask or delete bytes at random, this structured approach ensures that every region of the file is covered across successive versions. In the best case, the mask fully occludes the adversarial payload, neutralizing its influence on the model's decision. In the worst case, the mask partially occludes the payload, reducing its impact on the model's predictions. Because the payload is occluded in one or more masked versions, some versions remain representative of the file's original intent, which allows the voting mechanism to suppress the payload's influence. Results on the EMBER and BODMAS datasets show that our defense outperforms randomized and (de)randomized smoothing defenses against adversarial examples generated with a wide range of functionality-preserving manipulations, while maintaining high accuracy on clean examples.
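The masking-and-voting procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `strided_masks`, `classify_with_masking`, and `score_fn`, the mask token value 256, and both threshold parameters are assumptions introduced for illustration, since the abstract does not specify the mask size, the stride, or the exact voting rule.

```python
from typing import Callable, Iterator, List

# Assumption: an out-of-range value (valid bytes are 0..255) marks an occluded byte.
MASK_TOKEN = 256


def strided_masks(data: bytes, mask_size: int, stride: int) -> Iterator[List[int]]:
    """Yield copies of `data`, each with one contiguous window set to MASK_TOKEN.

    The window start advances deterministically by `stride`; a final window is
    aligned with the end of the file so that every byte is masked at least once.
    """
    if mask_size <= 0 or stride <= 0:
        raise ValueError("mask_size and stride must be positive")
    if stride > mask_size:
        raise ValueError("stride must not exceed mask_size, or bytes would be skipped")
    n = len(data)
    last_start = max(n - mask_size, 0)
    starts = list(range(0, last_start + 1, stride))
    if starts[-1] != last_start:
        starts.append(last_start)  # cover the tail of the file
    for start in starts:
        version = list(data)
        end = min(start + mask_size, n)
        version[start:end] = [MASK_TOKEN] * (end - start)
        yield version


def classify_with_masking(
    data: bytes,
    score_fn: Callable[[List[int]], float],  # hypothetical detector: tokens -> P(malicious)
    mask_size: int,
    stride: int,
    score_threshold: float = 0.5,  # assumed per-version decision threshold
    vote_threshold: float = 0.5,   # assumed fraction of malicious votes required
) -> bool:
    """Return True (malicious) if enough masked versions are classified as malicious."""
    votes = [score_fn(v) >= score_threshold for v in strided_masks(data, mask_size, stride)]
    return sum(votes) / len(votes) >= vote_threshold
```

In this sketch, requiring `stride <= mask_size` is what gives the coverage guarantee: consecutive windows touch or overlap, so a payload of any length is at least partially occluded in some version. The choice of `vote_threshold` trades robustness against false positives and would need to be tuned on clean data.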