Recent developments in the field of deep learning have demonstrated that deep neural networks (DNNs) are vulnerable to adversarial examples. Specifically, in image classification, an adversarial example can fool a well-trained deep neural network by adding barely imperceptible perturbations to clean images. Adversarial training, one of the most direct and effective defenses, minimizes the loss on perturbed data to learn deep networks that are robust to adversarial attacks. It has been shown that the fast gradient sign method (FGSM) enables fast adversarial training. However, FGSM-based adversarial training may ultimately yield a failed model because of overfitting to FGSM samples. In this paper, we propose Diversified Initialized Perturbations Adversarial Training (DIP-FAT), which seeks the initialization of the perturbation by enlarging the output distances of the target model along random directions. Owing to the diversity of these random directions, the embedded fast adversarial training using FGSM increases the information obtained from the adversary and reduces the possibility of overfitting. Beyond preventing overfitting, extensive results show that the proposed DIP-FAT technique also improves accuracy on clean data. The biggest advantage of DIP-FAT is that it achieves the best balance among clean-data accuracy, perturbed-data accuracy, and efficiency.
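The idea described above can be illustrated with a minimal sketch: choose a random initialization that enlarges the model's output distance, then take one FGSM step. This is only a toy illustration under stated assumptions, not the paper's implementation: a linear model stands in for the DNN, the squared loss replaces the classification loss, and all function names (`diversified_init`, `dip_fgsm_example`) and parameters (`n_dirs`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x, w):
    # Toy linear "network" standing in for a DNN: f(x) = w @ x,
    # trained with squared loss L = 0.5 * (f(x) - y)^2.
    return w @ x

def diversified_init(x, w, eps, n_dirs=8):
    # Sketch of the stated initialization idea: sample several random
    # directions and keep the one that most enlarges the model's output
    # distance from f(x).
    best_d, best_gap = None, -1.0
    for _ in range(n_dirs):
        d = rng.uniform(-eps, eps, size=x.shape)
        gap = abs(model(x + d, w) - model(x, w))
        if gap > best_gap:
            best_d, best_gap = d, gap
    return best_d

def dip_fgsm_example(x, w, y, eps):
    # Start the FGSM step from the diversified initialization instead of
    # from x itself, then project back into the eps-ball (L-infinity) of x.
    x0 = x + diversified_init(x, w, eps)
    grad_x = (model(x0, w) - y) * w          # dL/dx for the squared loss
    x_adv = x0 + eps * np.sign(grad_x)       # one signed-gradient step
    return np.clip(x_adv, x - eps, x + eps)

x = np.array([1.0, -2.0, 0.5])
w = np.array([0.5, 1.0, -1.0])
x_adv = dip_fgsm_example(x, w, y=0.0, eps=0.1)
```

In a full training loop, the perturbed example `x_adv` would then be fed back into the loss that is minimized over the model parameters; the random search over directions is what injects the diversity that the abstract credits with reducing overfitting to FGSM samples.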