We propose a framework for saliency-based, multi-target detection and segmentation of circular-scan, synthetic-aperture-sonar (CSAS) imagery. Our framework relies on a multi-branch, convolutional encoder-decoder network (MB-CEDN). The encoder portion of the MB-CEDN extracts visual contrast features from CSAS images. These features are fed into dual decoders that perform pixel-level segmentation to mask targets. Each decoder provides a different perspective on what constitutes a salient target. These opinions are aggregated and cascaded into a deep-parsing network to refine the segmentation. We evaluate our framework on real-world CSAS imagery spanning five broad target classes and compare against existing approaches from the computer-vision literature. We show that our framework outperforms supervised, deep-saliency networks designed for natural imagery, and it greatly outperforms unsupervised saliency approaches developed for natural imagery. This illustrates that natural-image-based models may need to be altered to be effective for this imaging-sonar modality.
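To make the architectural description concrete, the following is a minimal sketch, in PyTorch, of the general idea: a shared contrast-feature encoder feeding two decoders whose per-pixel saliency maps are fused. The layer widths, class names, single-channel input, and the one-convolution fusion used here in place of the cascaded deep-parsing network are all illustrative assumptions, not the authors' MB-CEDN implementation.

```python
# Illustrative multi-branch encoder-decoder for saliency segmentation.
# NOTE: this is an assumed sketch, not the paper's MB-CEDN; layer sizes,
# the shared-encoder/dual-decoder wiring, and the fusion step are guesses.
import torch
import torch.nn as nn


class ConvBlock(nn.Module):
    """Two 3x3 convolutions with batch norm and ReLU."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)


class Decoder(nn.Module):
    """Upsampling decoder mapping encoder features back to per-pixel saliency logits."""
    def __init__(self, channels=(256, 128, 64)):
        super().__init__()
        layers = []
        for in_ch, out_ch in zip(channels[:-1], channels[1:]):
            layers += [nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
                       ConvBlock(in_ch, out_ch)]
        self.up = nn.Sequential(*layers)
        self.head = nn.Conv2d(channels[-1], 1, kernel_size=1)  # single-channel saliency map

    def forward(self, x):
        return self.head(self.up(x))


class MultiBranchCEDN(nn.Module):
    """Shared encoder feeding dual decoders; their outputs ("opinions") are fused."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            ConvBlock(1, 64), nn.MaxPool2d(2),
            ConvBlock(64, 128), nn.MaxPool2d(2),
            ConvBlock(128, 256),
        )
        self.decoder_a = Decoder()  # one notion of what constitutes a salient target
        self.decoder_b = Decoder()  # a complementary notion
        self.fuse = nn.Conv2d(2, 1, kernel_size=1)  # crude stand-in for the deep-parsing refinement

    def forward(self, x):
        feats = self.encoder(x)
        sal_a = self.decoder_a(feats)
        sal_b = self.decoder_b(feats)
        fused = self.fuse(torch.cat([sal_a, sal_b], dim=1))
        return torch.sigmoid(fused)


if __name__ == "__main__":
    model = MultiBranchCEDN()
    image = torch.randn(1, 1, 128, 128)  # single-channel, sonar-like tile
    print(model(image).shape)            # torch.Size([1, 1, 128, 128])
```

In the paper's framework, the aggregated decoder outputs are further refined by a cascaded deep-parsing network rather than the single fusion convolution used in this sketch.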