Each week, we select a recently published Open Access article to feature. This week’s article comes from Statistical Analysis and Data Mining and presents a computationally efficient method for detecting collective and point anomalies in data sequences, with applications to exoplanet detection and machine fault detection from temperature data.
The article’s abstract is given below, with the full article available to read here.
A linear time method for the detection of collective and point anomalies, Stat. Anal. Data Min.: ASA Data Sci. J. 15 (2022), 494–508. https://doi.org/10.1002/sam.11586
The challenge of efficiently identifying anomalies in data sequences is an important statistical problem that now arises in many applications. Although there has been substantial work aimed at making statistical analyses robust to outliers, or point anomalies, there has been much less work on detecting anomalous segments, or collective anomalies, particularly in those settings where point anomalies might also occur. In this article, we introduce collective and point anomalies (CAPA), a computationally efficient approach that is suitable when collective anomalies are characterized by either a change in mean, variance, or both, and distinguishes them from point anomalies. Empirical results show that CAPA has close to linear computational cost as well as being more accurate at detecting and locating collective anomalies than other approaches. We demonstrate the utility of CAPA through its ability to detect exoplanets from light curve data from the Kepler telescope and its capacity to detect machine faults from temperature data.
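To give a feel for the distinction the abstract draws between collective and point anomalies, here is a minimal sketch of a penalized-cost search in Python. It is not the authors' CAPA implementation: it handles changes in mean only, runs in quadratic rather than near-linear time, and assumes the data have been standardized so that typical points are roughly N(0, 1). The function name `detect_anomalies`, the penalty defaults and the minimum segment length are all illustrative assumptions.

```python
import numpy as np


def detect_anomalies(x, beta=None, beta_point=None, min_len=5):
    """Label each point as typical, a point anomaly, or part of a
    collective anomaly (mean shift), by minimizing a penalized cost.

    Simplified illustration only: quadratic time, mean changes only,
    and x is assumed standardized (typical points ~ N(0, 1)).
    Returns (collective, points): half-open (start, end) index pairs
    and a list of point-anomaly indices.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Assumed penalty defaults; in practice these would need tuning.
    if beta is None:
        beta = 4.0 * np.log(n)
    if beta_point is None:
        beta_point = 3.0 * np.log(n)

    # Prefix sums give the squared error of any segment in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x ** 2)))

    cost = np.zeros(n + 1)       # cost[m]: best cost for x[:m]
    choice = [None] * (n + 1)    # how the last block ending at m was explained

    for m in range(1, n + 1):
        # Option 1: x[m-1] is a typical point (mean 0, variance 1).
        best, how = cost[m - 1] + x[m - 1] ** 2, ("typical", m - 1)
        # Option 2: x[m-1] is a point anomaly, paying a fixed penalty.
        cand = cost[m - 1] + beta_point
        if cand < best:
            best, how = cand, ("point", m - 1)
        # Option 3: x[k:m] is a collective anomaly with its own mean.
        for k in range(0, m - min_len + 1):
            length = m - k
            sse = (s2[m] - s2[k]) - (s1[m] - s1[k]) ** 2 / length
            cand = cost[k] + sse + beta
            if cand < best:
                best, how = cand, ("collective", k)
        cost[m], choice[m] = best, how

    # Walk back through the stored choices to recover the labelling.
    collective, points = [], []
    m = n
    while m > 0:
        kind, k = choice[m]
        if kind == "collective":
            collective.append((k, m))
        elif kind == "point":
            points.append(m - 1)
        m = k
    return sorted(collective), sorted(points)


# Synthetic example: a smooth baseline, one shifted segment, one spike.
signal = 0.5 * np.sin(np.arange(200))
signal[80:110] += 5.0   # collective anomaly
signal[150] = 10.0      # point anomaly
segments, spikes = detect_anomalies(signal)
print(segments, spikes)
```

The fixed penalty for a point anomaly is what stops a single spike from being fitted as a short anomalous segment, while the minimum length and the larger segment penalty stop ordinary fluctuations from being reported as collective anomalies.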