Anomaly detection is a fundamental task in time series analytics, with important implications for the downstream performance of many applications. Despite growing academic interest and the large number of methods proposed in the literature, recent benchmarks and evaluation studies have demonstrated that no single anomaly detection method performs best across very heterogeneous time series datasets. Therefore, the only scalable and viable way to perform anomaly detection over very different time series collected from diverse domains is a model selection method that chooses, based on time series characteristics, the best anomaly detection method to run. Existing AutoML solutions are, unfortunately, not directly applicable to time series anomaly detection, and no evaluation of time series-based approaches for model selection exists. To that end, this paper studies the performance of time series classification methods used as model selection for anomaly detection. In total, we evaluate 234 model configurations derived from 16 base classifiers across more than 1980 time series, and we present the first extensive experimental evaluation of time series classification as model selection for anomaly detection. Our results demonstrate that model selection methods outperform every individual anomaly detection method while keeping execution time within the same order of magnitude. This evaluation is a first step toward demonstrating the accuracy and efficiency of time series classification algorithms for anomaly detection, and it provides a strong baseline that can guide the model selection step in general AutoML pipelines. Preprint version of an article accepted at the VLDB Journal.
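To make the idea of classification as model selection concrete, the following is a minimal sketch under stated assumptions, not the pipeline evaluated in the paper. The two detectors, the two-dimensional feature vector, and the nearest-neighbor selector are all hypothetical stand-ins chosen for brevity; the training labels are assumed to be given, for example as the name of the detector that scored best on each labeled training series.

```python
import math
import statistics
from typing import Callable, Dict, List, Sequence, Tuple

Series = Sequence[float]


def zscore_detector(x: Series) -> List[float]:
    """Hypothetical detector: anomaly score is the absolute z-score of each point."""
    mu = statistics.fmean(x)
    sd = statistics.pstdev(x) or 1.0  # avoid division by zero on constant series
    return [abs(v - mu) / sd for v in x]


def diff_detector(x: Series) -> List[float]:
    """Hypothetical detector: anomaly score is the jump from the previous point."""
    return [0.0] + [abs(b - a) for a, b in zip(x, x[1:])]


# Candidate pool the selector chooses from (illustrative, not the paper's set).
DETECTORS: Dict[str, Callable[[Series], List[float]]] = {
    "zscore": zscore_detector,
    "diff": diff_detector,
}


def features(x: Series) -> Tuple[float, float]:
    """Toy characteristics: standard deviation and mean absolute first difference."""
    sd = statistics.pstdev(x)
    mean_abs_diff = statistics.fmean(abs(b - a) for a, b in zip(x, x[1:]))
    return (sd, mean_abs_diff)


class NearestNeighborSelector:
    """1-nearest-neighbor classifier mapping a series to a detector name."""

    def fit(self, series_list: Sequence[Series], best_labels: Sequence[str]):
        # best_labels[i] is assumed to name the best detector for series_list[i].
        self._train = [(features(s), lab) for s, lab in zip(series_list, best_labels)]
        return self

    def predict(self, x: Series) -> str:
        f = features(x)
        return min(self._train, key=lambda item: math.dist(f, item[0]))[1]


def select_and_detect(selector: NearestNeighborSelector, x: Series):
    """Pick a detector for x, then run only that detector."""
    name = selector.predict(x)
    return name, DETECTORS[name](x)
```

The point of the design is that only one detector runs per series at inference time, so the extra cost over a single detector is the classifier's prediction step; a real selector would replace the toy features and nearest-neighbor rule with a trained time series classifier.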