Each week, we publish layman’s abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.
(2022). Improving image classification robustness using self-supervision. Stat, 11(1), e455. https://doi.org/10.1002/sta4.455
Neural networks are among the most powerful tools in machine learning, but they cannot always be trained under ideal conditions: the large, high-quality, annotated data sets this requires are not available for every purpose. Robustness is therefore a very important property for neural networks; they should lose as little performance as possible as learning conditions become more difficult. Small data sets pose a challenge for convolutional neural networks, which come with a huge number of trainable parameters and tend to overfit when too few samples are available. Imbalance within a data set induces a bias towards the majority classes, which can shift the decision boundaries so that underrepresented classes are barely learned, even though these are often the examples of primary interest. Missing or unrecognizable image parts can furthermore decrease classification accuracy significantly; even changes that are imperceptible to humans can be problematic.

Self-supervised training is a promising strategy for handling these difficulties and for making machine learning applicable in fields where large quantities of high-quality data are scarce. The same data set is used twice, but in very different ways, in order to increase the information content that can be extracted from it. During the pre-training step, a comparatively simple pretext task is solved using pseudo-labels derived from the data itself, which makes training on unlabeled data sets possible and teaches the network valuable semantic knowledge. The network's weights, pre-trained in this way, enable better classification during the actual downstream task. The pretext task has to be tailored to the characteristics of the data set and the downstream task in order to generate significant improvements.
Predicting the degree of rotation is a suitable pretext task for clearly spatially structured data sets such as MNIST (handwritten digits) and Fashion-MNIST (articles of clothing). The publication “Improving image classification robustness using self-supervision” demonstrates that self-supervised learning significantly outperforms the standard approach when the learning difficulty is elevated; the gap in performance widens as the data are manipulated more strongly. Classification accuracy on very small data sets improves by up to 12.5% for MNIST and 15.2% for Fashion-MNIST, and by up to 2.2% and 7.3% for input data with missing parts. With very few training samples, self-supervision prevents certain classes from going unlearned entirely. The combination of data scarcity and imbalance even yields an accuracy increase of 42.9% due to self-supervision. Further, smaller improvements are also achieved for imbalanced (up to 2.8% and 0.8%) and rotated data sets (up to 0.8% and 1.9%). These results demonstrate the potential of self-supervision for robustifying neural networks.
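The idea of the rotation pretext task can be illustrated with a minimal sketch. The snippet below is not the authors' code; it only shows, under simple assumptions (square grayscale images, four rotation classes), how rotation pseudo-labels can be generated from unlabeled data so that a network can be pre-trained to predict them:

```python
import numpy as np

def make_rotation_pretext_batch(images, seed=None):
    """Generate a rotation-prediction pretext batch.

    Each image is rotated by 0, 90, 180 or 270 degrees; the rotation
    index (0-3) serves as the pseudo-label, so no human annotation is
    needed. `images` is an array of shape (n, height, width).
    """
    rng = np.random.default_rng(seed)
    # One pseudo-label per image, drawn uniformly from {0, 1, 2, 3}.
    labels = rng.integers(0, 4, size=len(images))
    # Rotate each image by label * 90 degrees.
    rotated = np.stack([np.rot90(img, k) for img, k in zip(images, labels)])
    return rotated, labels

# Usage: pre-train a network to predict `labels` from `rotated`,
# then reuse its weights for the actual downstream classification task.
```

In this setup the pre-training step requires only the raw images; the "annotation" is created by the transformation itself, which is what makes the approach attractive when labeled data are scarce.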