Multimodal learning systems often face substantial uncertainty due to noisy data, low-quality labels, and heterogeneous modality characteristics. These issues become especially critical in human-computer interaction settings, where data quality, semantic reliability, and annotation consistency vary across users and recording conditions. This thesis addresses these challenges by exploring uncertainty-resilient multimodal learning through consistency-guided cross-modal transfer. The central idea is to use cross-modal semantic consistency as the basis for robust representation learning. By projecting heterogeneous modalities into a shared latent space, the proposed framework mitigates modality gaps and uncovers structural relations that support uncertainty estimation and stable feature learning. Building on this foundation, the thesis investigates strategies to enhance semantic robustness, improve data efficiency, and reduce the impact of noise and imperfect supervision without relying on large volumes of high-quality annotations. Experiments on multimodal affect-recognition benchmarks demonstrate that consistency-guided cross-modal transfer significantly improves model stability, discriminative ability, and robustness to noisy or incomplete supervision. Latent space analyses further show that the framework captures reliable cross-modal structure even under challenging conditions. Overall, this thesis offers a unified perspective on resilient multimodal learning by integrating uncertainty modeling, semantic alignment, and data-efficient supervision, providing practical insights for developing reliable and adaptive brain-computer interface systems.
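The idea of projecting heterogeneous modalities into a shared latent space and measuring their semantic consistency can be illustrated with a minimal sketch. This is not the thesis's implementation: the linear projections, the cosine-based consistency loss, the random stand-in features, and all names and dimensions (`project`, `consistency_loss`, `d_a`, `d_b`, `d_z`) are assumptions chosen only for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length (eps guards against division by zero)."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def project(features, weights):
    """Linearly map one modality into the shared latent space, then normalize.

    A linear map is an assumption here; a real encoder would likely be nonlinear.
    """
    return l2_normalize(features @ weights)


def consistency_loss(z_a, z_b):
    """Mean (1 - cosine similarity) over paired samples.

    Equals 0 when paired embeddings coincide and grows as they disagree.
    """
    cos = np.sum(z_a * z_b, axis=-1)
    return float(np.mean(1.0 - cos))


# Hypothetical sizes: 8 paired samples, two modalities with different feature widths.
rng = np.random.default_rng(0)
n, d_a, d_b, d_z = 8, 32, 20, 16
x_a = rng.normal(size=(n, d_a))  # stand-in for features of modality A
x_b = rng.normal(size=(n, d_b))  # stand-in for features of modality B
w_a = rng.normal(size=(d_a, d_z))  # untrained projection for modality A
w_b = rng.normal(size=(d_b, d_z))  # untrained projection for modality B

z_a = project(x_a, w_a)
z_b = project(x_b, w_b)
loss = consistency_loss(z_a, z_b)
print(f"consistency loss between modalities: {loss:.3f}")
```

In a trained system, the projections would be optimized so that this loss decreases for semantically matched pairs. Using the per-sample disagreement `1 - cos` as a rough indicator of unreliable samples is one possible design choice, stated here as an assumption rather than as the method the thesis uses.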