In the application of machine learning to remote sensing, labeled data is often scarce or expensive, which impedes the training of powerful models like deep convolutional neural networks. Although unlabeled data is abundant, recent self-supervised learning approaches are ill-suited to the remote sensing domain. In addition, most remote sensing applications currently use only a small subset of the multi-sensor, multi-channel information available, motivating the need for fused multi-sensor representations. We propose a new self-supervised training objective, Contrastive Sensor Fusion, which exploits coterminous data from multiple sources to learn useful representations of every possible combination of those sources. This method uses information common across multiple sensors and bands by training a single model to produce a representation that remains similar when any subset of its input channels is used. Using a dataset of 47 million unlabeled coterminous image triplets, we train an encoder to produce semantically meaningful representations from any possible combination of channels from the input sensors. These representations outperform fully supervised ImageNet weights on a remote sensing classification task and improve as more sensors are fused. Our code is available at https://storage.cloud.google.com/public-published-datasets/csf_code.zip.
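To make the training objective concrete, below is a minimal sketch of the idea in PyTorch. This is not the released implementation: the zero-masking of dropped channels, the InfoNCE-style loss, the temperature value, and the names `random_channel_mask` and `subset_invariance_loss` are all illustrative assumptions, chosen only to show how a single encoder can be trained so that its output stays similar across channel subsets.

```python
# Minimal sketch of a contrastive channel-subset objective (assumed form,
# not the paper's released code). Dropped channels are zero-masked so the
# encoder keeps a fixed input size, and agreement between two subset views
# of the same scene is scored with an InfoNCE-style loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

def random_channel_mask(x: torch.Tensor) -> torch.Tensor:
    """Zero out a random, nonempty subset of input channels per example."""
    b, c, _, _ = x.shape
    keep = torch.rand(b, c, 1, 1, device=x.device) < 0.5
    # Guarantee at least one channel survives in every example.
    keep[torch.arange(b), torch.randint(c, (b,))] = True
    return x * keep

def subset_invariance_loss(encoder: nn.Module, x: torch.Tensor,
                           temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE loss: pull together embeddings of two channel subsets of the
    same scene, push apart embeddings of different scenes in the batch."""
    z1 = F.normalize(encoder(random_channel_mask(x)), dim=1)
    z2 = F.normalize(encoder(random_channel_mask(x)), dim=1)
    logits = z1 @ z2.t() / temperature   # (batch, batch) similarity matrix
    targets = torch.arange(x.size(0), device=x.device)
    return F.cross_entropy(logits, targets)
```

Because each positive pair consists of two different channel subsets of the same scene, the encoder is rewarded for extracting information shared across sensors and bands, which is the property that lets the fused representation improve as more sensors are added.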