With the rapid growth of social media posts, analyzing the sentiments embedded in image-text pairs has become a popular research topic in recent years. Although existing works achieve impressive results in jointly harnessing image and text information, they rarely consider possible low-quality and missing modalities. In real-world applications, these issues frequently occur, creating an urgent need for models that can predict sentiment robustly. Therefore, we propose a Distribution-based feature Recovery and Fusion (DRF) method for robust multimodal sentiment analysis of image-text pairs. Specifically, we maintain a feature queue for each modality to approximate its feature distribution, which allows us to handle low-quality and missing modalities in a unified framework. For low-quality modalities, we quantitatively estimate modality quality from the distributions and reduce the contributions of low-quality modalities to the fusion accordingly. For missing modalities, we build inter-modal mapping relationships supervised by both samples and distributions, thereby recovering the missing modalities from the available ones. In experiments, two disruption strategies that corrupt and discard modalities in samples are adopted to mimic the low-quality and missing modalities arising in various real-world scenarios. Through comprehensive experiments on three publicly available image-text datasets, we demonstrate consistent improvements of DRF over SOTA methods under both strategies, validating its effectiveness for robust multimodal sentiment analysis.
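To make the distribution idea concrete, below is a minimal PyTorch sketch of one possible reading of the feature-queue and quality-weighted fusion steps. The names (`FeatureQueue`, `quality_weight`), the queue size, and the diagonal-Gaussian quality score are illustrative assumptions, not the authors' implementation.

```python
import torch

class FeatureQueue:
    """FIFO queue of recent features for one modality; its contents
    approximate that modality's feature distribution (illustrative)."""

    def __init__(self, dim: int, size: int = 4096):
        self.size = size
        self.buffer = torch.zeros(size, dim)
        self.ptr = 0
        self.full = False

    @torch.no_grad()
    def enqueue(self, feats: torch.Tensor) -> None:
        # write a batch of features at the current pointer, wrapping around
        n = feats.size(0)
        idx = (self.ptr + torch.arange(n)) % self.size
        self.buffer[idx] = feats
        self.full = self.full or (self.ptr + n) >= self.size
        self.ptr = (self.ptr + n) % self.size

    def stats(self):
        # mean/std of stored features: a diagonal-Gaussian fit of the distribution
        data = self.buffer if self.full else self.buffer[: self.ptr]
        return data.mean(dim=0), data.std(dim=0) + 1e-6

def quality_weight(feat: torch.Tensor, mean: torch.Tensor, std: torch.Tensor) -> torch.Tensor:
    """Score how typical a feature is under the queue's Gaussian fit;
    atypical (likely corrupted) features receive lower fusion weight."""
    z = ((feat - mean) / std).pow(2).mean(dim=-1)  # mean squared z-score
    return torch.exp(-z)                           # in (0, 1]; 1 = perfectly typical

# quality-weighted fusion of paired image/text features (hypothetical usage):
#   w_img = quality_weight(f_img, *img_queue.stats())
#   w_txt = quality_weight(f_txt, *txt_queue.stats())
#   fused = (w_img[:, None] * f_img + w_txt[:, None] * f_txt) / (w_img + w_txt)[:, None]
```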
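Under the same assumptions, the missing-modality recovery step might be sketched as a small mapping network trained with both a sample-level loss (against the paired feature when both modalities are present) and a distribution-level loss (moment matching against the target modality's queue). `Text2ImageMapper` and `recovery_loss` are hypothetical names, and moment matching is only one plausible form of distribution supervision.

```python
import torch
import torch.nn as nn

class Text2ImageMapper(nn.Module):
    """Hypothetical mapping network that recovers a missing image feature
    from the available text feature (the reverse direction is symmetric)."""

    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, f_txt: torch.Tensor) -> torch.Tensor:
        return self.net(f_txt)

def recovery_loss(f_rec: torch.Tensor, f_img: torch.Tensor,
                  img_mean: torch.Tensor, img_std: torch.Tensor) -> torch.Tensor:
    # sample-level supervision: on complete pairs, the recovered feature
    # should match the ground-truth image feature
    sample_term = (f_rec - f_img).pow(2).mean()
    # distribution-level supervision: batch moments of recovered features
    # should match the image queue's moments (requires batch size > 1)
    dist_term = ((f_rec.mean(0) - img_mean).pow(2).mean()
                 + (f_rec.std(0) - img_std).pow(2).mean())
    return sample_term + dist_term
```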