MSAD: A Deep Dive into Model Selection for Time series Anomaly Detection

Anomaly detection is a fundamental task in time series analytics, with important implications for the downstream performance of many applications. Despite increasing academic interest and the large number of methods proposed in the literature, recent benchmarks and evaluation studies have demonstrated that no single anomaly detection method performs best across very heterogeneous time series datasets. Therefore, the only scalable and viable way to perform anomaly detection over very different time series collected from diverse domains is a model selection method that chooses, based on time series characteristics, the best anomaly detection method to run. Existing AutoML solutions are, unfortunately, not directly applicable to time series anomaly detection, and no evaluation of time series-based approaches for model selection exists. To that end, this paper studies the performance of time series classification methods used as model selection for anomaly detection. In total, we evaluate 234 model configurations derived from 16 base classifiers across more than 1980 time series, and we propose the first extensive experimental evaluation of time series classification as model selection for anomaly detection. Our results demonstrate that model selection methods outperform every single anomaly detection method while staying within the same order of magnitude in execution time. This evaluation is a first step toward demonstrating the accuracy and efficiency of time series classification algorithms for anomaly detection, and it represents a strong baseline that can be used to guide the model selection step in general AutoML pipelines. Preprint version of an article accepted at the VLDB Journal.
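To make the idea of classification as model selection concrete, here is a minimal sketch in plain Python. It is not the method evaluated in the paper: the two toy detectors (`zscore_detector`, `diff_detector`), the two hand-picked features, the nearest-centroid classifier, and the training labels are all assumptions chosen for illustration. The only point is the shape of the pipeline: a classifier trained on series labeled with their best detector picks which detector to run on a new series.

```python
import math
import statistics

def features(series):
    """Summarise a series with two hand-picked statistics (an assumption)."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    # Lag-1 autocorrelation; 0.0 for a constant series.
    num = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    den = sum((x - mean) ** 2 for x in series)
    acf1 = num / den if den else 0.0
    return (std, acf1)

def zscore_detector(series):
    """Toy detector: score each point by its distance from the global mean."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series) or 1.0
    return [abs(x - mean) / std for x in series]

def diff_detector(series):
    """Toy detector: score each point by the jump from the previous point."""
    return [0.0] + [abs(b - a) for a, b in zip(series, series[1:])]

# Hypothetical pool of candidate detectors the selector can choose from.
DETECTORS = {"zscore": zscore_detector, "diff": diff_detector}

class NearestCentroidSelector:
    """Stand-in classifier: maps a series to the name of a detector."""

    def fit(self, series_list, best_detector_labels):
        grouped = {}
        for series, label in zip(series_list, best_detector_labels):
            grouped.setdefault(label, []).append(features(series))
        # One centroid per detector label, averaged feature by feature.
        self.centroids = {
            label: tuple(statistics.fmean(col) for col in zip(*rows))
            for label, rows in grouped.items()
        }
        return self

    def predict(self, series):
        f = features(series)
        return min(self.centroids,
                   key=lambda label: math.dist(f, self.centroids[label]))

# Training set: each series is labeled with the detector assumed to work
# best on it (in practice this label would come from an offline benchmark).
train_series = [[0, 1, 0, 1, 0, 1, 0, 1], [0, 1, 2, 3, 4, 5, 6, 7]]
train_labels = ["zscore", "diff"]
selector = NearestCentroidSelector().fit(train_series, train_labels)

# Model selection step: classify the new series, then run only that detector.
query = [0, 1, 2, 3, 4, 5, 6, 20]
chosen = selector.predict(query)
scores = DETECTORS[chosen](query)
print(chosen, scores.index(max(scores)))
```

Only the selected detector is executed on the new series, which is why the cost of selection plus detection can stay in the same order of magnitude as running a single detector.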
