Out-of-domain (OOD) robustness is challenging to achieve in real-world computer vision applications, where shifts in image background, style, and acquisition instruments routinely degrade model performance. Generic augmentations show inconsistent gains under such shifts, whereas dataset-specific augmentations require expert knowledge and prior analysis. Moreover, prior studies show that neural networks adapt poorly to domain shifts because they exhibit a learning bias toward domain-specific frequency components. Perturbing frequency values can mitigate such bias but overlooks pixel-level details, leading to suboptimal performance. To address these problems, we propose D-GAP (Dataset-agnostic and Gradient-guided augmentation in Amplitude and Pixel spaces), which improves OOD robustness by introducing targeted augmentation in both the amplitude space (frequency space) and the pixel space. Unlike conventional handcrafted augmentations, D-GAP computes sensitivity maps in the frequency space from task gradients, which reflect how strongly the model responds to different frequency components, and uses these maps to adaptively interpolate amplitudes between source and target samples. In this way, D-GAP reduces the learning bias in frequency space, while a complementary pixel-space blending procedure restores fine spatial details. Extensive experiments on four real-world datasets and three domain-adaptation benchmarks show that D-GAP consistently outperforms both generic and dataset-specific augmentations, improving average OOD performance by +5.3% on real-world datasets and +1.8% on benchmark datasets.
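To make the described mechanism concrete, the sketch below illustrates one plausible reading of a D-GAP-style augmentation step: amplitudes and phases are obtained via a 2-D FFT, a sensitivity map is taken as the gradient of the task loss with respect to the source amplitudes, amplitudes are interpolated toward a target-domain sample in proportion to that sensitivity, and the result is blended with the original image in pixel space. This is a minimal illustration under assumed choices (the `alpha`/`beta` mixing coefficients, the max-normalization of gradients, and the cross-entropy loss are our assumptions), not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def dgap_style_augment(model, x_src, x_tgt, y_src, alpha=0.5, beta=0.7):
    """Hedged sketch of a gradient-guided amplitude/pixel augmentation.

    x_src, x_tgt: (B, C, H, W) source- and target-domain images.
    alpha: maximum amplitude-mixing strength (assumed hyperparameter).
    beta:  pixel-space blending weight (assumed hyperparameter).
    """
    # Decompose both images into amplitude and phase with a 2-D FFT.
    fft_src = torch.fft.fft2(x_src)
    fft_tgt = torch.fft.fft2(x_tgt)
    amp_src, pha_src = fft_src.abs(), fft_src.angle()
    amp_tgt = fft_tgt.abs()

    # Sensitivity map: gradient of the task loss w.r.t. the source amplitudes.
    amp_var = amp_src.clone().detach().requires_grad_(True)
    x_rec = torch.fft.ifft2(torch.polar(amp_var, pha_src)).real
    loss = F.cross_entropy(model(x_rec), y_src)
    (sens,) = torch.autograd.grad(loss, amp_var)

    # Normalize |gradients| to [0, 1] per sample so that frequencies the
    # model responds to most strongly receive the strongest perturbation.
    s = sens.abs()
    s = s / (s.amax(dim=(-2, -1), keepdim=True) + 1e-8)

    # Adaptive amplitude interpolation between source and target samples.
    amp_aug = (1 - alpha * s) * amp_src + alpha * s * amp_tgt
    x_freq = torch.fft.ifft2(torch.polar(amp_aug, pha_src)).real

    # Complementary pixel-space blending to restore fine spatial details.
    return beta * x_freq + (1 - beta) * x_src
```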