Each week, we select a recently published Open Access article to feature. This week’s article comes from the 2020 Symposium for Data Science and Statistics special issue of Stat and considers self‐supervised learning for outlier detection.
The article’s abstract is given below, with the full article available to read here.
Self‐supervised learning for outlier detection. Stat. 2021; 10:e322. https://doi.org/10.1002/sta4.322
The identification of outliers is mainly based on unannotated data and therefore constitutes an unsupervised problem. The lack of a label leads to numerous challenges that do not occur or only occur to a lesser extent when using annotated data and supervised methods. In this paper, we focus on two of these challenges: the selection of hyperparameters and the selection of informative features. To this end, we propose a method to transform the unsupervised problem of outlier detection into a supervised problem. Benchmarking our approach against common outlier detection methods shows clear advantages of our method when many irrelevant features are present. Furthermore, the proposed approach also scores very well in the selection of hyperparameters, i.e., compared to methods with randomly selected hyperparameters.
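The abstract does not spell out how the unsupervised problem is turned into a supervised one, so the following is only a generic illustration of the idea and should not be read as the article's procedure. It assumes one common trick: draw synthetic "contrast" points uniformly over the range of the data, label real points 0 and synthetic points 1, and use a supervised classifier's output as the outlier score. The classifier here is a plain k-nearest-neighbour vote, and the function names, the planted outlier, and the choice of k are all hypothetical.

```python
import random

def make_contrast_sample(real, n_synthetic, rng):
    """Draw synthetic points uniformly from the bounding box of the real data."""
    dims = range(len(real[0]))
    lows = [min(p[d] for p in real) for d in dims]
    highs = [max(p[d] for p in real) for d in dims]
    return [[rng.uniform(lows[d], highs[d]) for d in dims]
            for _ in range(n_synthetic)]

def knn_outlier_score(point, real, synthetic, k):
    """Fraction of the k nearest labelled points that are synthetic (label 1).

    The point itself is left out of the vote if it is part of the real data.
    """
    labelled = [(p, 0) for p in real if p is not point]
    labelled += [(p, 1) for p in synthetic]
    labelled.sort(key=lambda item: sum((a - b) ** 2
                                       for a, b in zip(item[0], point)))
    return sum(label for _, label in labelled[:k]) / k

# Toy data (assumption): a 2-D Gaussian cloud plus one planted outlier.
rng = random.Random(0)
inliers = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
planted_outlier = [8.0, 8.0]
real = inliers + [planted_outlier]

# Surrogate supervised task: real (0) versus synthetic (1).
synthetic = make_contrast_sample(real, n_synthetic=200, rng=rng)

score_outlier = knn_outlier_score(planted_outlier, real, synthetic, k=10)
score_center = knn_outlier_score([0.0, 0.0], real, synthetic, k=10)
print("planted outlier:", score_outlier, "cloud centre:", score_center)
```

Because the surrogate task has labels, a hyperparameter such as k could in principle be tuned by ordinary cross-validation on the real-versus-synthetic classification, which is the kind of benefit the abstract points to.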