This paper proposes a training data augmentation pipeline that combines synthetic image data with neural style transfer to address the vulnerability of deep vision models to common corruptions. We show that although applying style transfer to synthetic images degrades their quality as measured by the widely used Fréchet Inception Distance (FID), these images are surprisingly beneficial for model training. We conduct a systematic empirical analysis of the effects of both augmentations and their key hyperparameters on the performance of image classifiers. Our results demonstrate that stylization and synthetic data complement each other well and can be combined with popular rule-based data augmentation techniques such as TrivialAugment, although not with all of them. Our method achieves state-of-the-art corruption robustness on several small-scale image classification benchmarks, reaching 93.54%, 74.9%, and 50.86% robust accuracy on CIFAR-10-C, CIFAR-100-C, and TinyImageNet-C, respectively.
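To make the described pipeline concrete, the following is a minimal sketch (not the authors' released code) of how stylized synthetic images might be mixed into training alongside TrivialAugment. Here `style_transfer_fn` is a hypothetical hook standing in for a neural style transfer model (e.g. AdaIN), `synthetic_dataset` stands in for a source of synthetic images, and the stylization probability is an assumed hyperparameter of the kind the paper analyzes; only `TrivialAugmentWide` is a real torchvision API.

```python
# A sketch of the augmentation pipeline under stated assumptions:
# stylize a fraction of synthetic images, then apply rule-based
# augmentation (TrivialAugment) on top.
import random

from torch.utils.data import Dataset
from torchvision import transforms


class StylizedSyntheticDataset(Dataset):
    """Wraps a synthetic dataset, stylizing a random fraction of its images."""

    def __init__(self, synthetic_dataset, style_transfer_fn, stylize_prob=0.5):
        self.dataset = synthetic_dataset      # yields (PIL image, label) pairs
        self.stylize = style_transfer_fn      # hypothetical style-transfer hook
        self.stylize_prob = stylize_prob      # assumed key hyperparameter
        # Rule-based augmentation applied after (possible) stylization.
        self.rule_based = transforms.Compose([
            transforms.TrivialAugmentWide(),
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.dataset)

    def __getitem__(self, idx):
        image, label = self.dataset[idx]
        if random.random() < self.stylize_prob:
            image = self.stylize(image)       # neural style transfer
        return self.rule_based(image), label
```

In practice such a wrapper could be concatenated with the real training set (e.g. via `torch.utils.data.ConcatDataset`) so that classifiers see both original and stylized synthetic samples in every epoch.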