Semantic segmentation is a crucial step in many Earth observation tasks. Training deep networks for semantic segmentation requires large quantities of pixel-level annotations. Earth observation techniques serve a wide variety of applications, and because the classes of interest vary widely from one application to another, labeling Earth observation images often requires domain knowledge. This limits the availability of labeled training data in many Earth observation applications. To tackle these challenges, we propose an unsupervised semantic segmentation method that can be trained using just a single unlabeled scene. Remote sensing scenes are generally large. The proposed method exploits this property by sampling smaller patches from the larger scene, and it uses deep clustering and contrastive learning to refine the weights of a lightweight deep model composed of a series of convolution layers with embedded channel attention. After unsupervised training on the target image/scene, the model automatically separates the major classes present in the scene and produces the segmentation map. Experimental results on the Vaihingen dataset demonstrate the efficacy of the proposed method.
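The patch-sampling step described above can be illustrated with a minimal sketch. The abstract does not specify the sampling strategy, so uniform random cropping is an assumption here, and the function name `sample_patches`, its parameters, and the scene dimensions are hypothetical choices made for illustration only.

```python
import numpy as np


def sample_patches(scene, patch_size, num_patches, seed=0):
    """Randomly crop square patches from one large scene of shape (H, W, C).

    Assumption: crops are drawn uniformly at random; the paper may use
    a different sampling scheme.
    """
    height, width = scene.shape[:2]
    if patch_size > height or patch_size > width:
        raise ValueError("patch_size must not exceed the scene dimensions")
    rng = np.random.default_rng(seed)
    # Top-left corners are chosen so that every patch lies fully inside the scene.
    rows = rng.integers(0, height - patch_size + 1, size=num_patches)
    cols = rng.integers(0, width - patch_size + 1, size=num_patches)
    return np.stack(
        [scene[r:r + patch_size, c:c + patch_size] for r, c in zip(rows, cols)]
    )


# Stand-in array for a single unlabeled scene (hypothetical size).
scene = np.zeros((512, 768, 3), dtype=np.float32)
patches = sample_patches(scene, patch_size=64, num_patches=16)
```

Each call returns an array of shape `(num_patches, patch_size, patch_size, C)`, which could serve as one training batch for the clustering and contrastive objectives.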