The proliferation of sensors produces an immense volume of spatio-temporal (ST) data in many domains, including monitoring, diagnostics, and prognostics applications. Curating such large volumes of data is time-consuming, which makes deploying data analytics platforms in new environments challenging and expensive. Transfer learning (TL) mechanisms promise to mitigate data sparsity and model complexity by reusing pre-trained models for a new task. Despite the success of TL in fields such as computer vision and natural language processing, efforts on complex ST models for anomaly detection (AD) applications remain limited. In this study, we examine the potential of TL for high-dimensional ST AD with a hybrid autoencoder architecture that incorporates convolutional, graph, and recurrent neural networks. Motivated by the need for improved model accuracy and robustness, particularly when training data are limited on systems with thousands of sensors, this research investigates the transferability of models trained on different sections of the Hadron Calorimeter of the Compact Muon Solenoid experiment at CERN. The key contributions of the study include exploring TL's potential and limitations for encoder and decoder networks, and revealing insights into model initialization and training configurations that enhance performance while substantially reducing trainable parameters and mitigating data contamination effects. Code: https://github.com/muleina/CMS_HCAL_ML_OnlineDQM