Open Access: Using text mining to track outbreak trends in global surveillance of emerging diseases: ProMED-mail

Each week, we select a recently published Open Access article to feature. This week’s article comes from the Journal of the Royal Statistical Society Series A and shows the potential for ProMED to be used in monitoring responses to outbreaks of infectious diseases. 

The article’s abstract is given below, with the full article available to read here.

You, J., Expert, P. & Costelloe, C. (2021) Using text mining to track outbreak trends in global surveillance of emerging diseases: ProMED-mail. Journal of the Royal Statistical Society: Series A (Statistics in Society), 00, 1–15.

ProMED-mail (Program for Monitoring Emerging Diseases) is an international disease outbreak monitoring and early warning system. Every year, users contribute thousands of reports that include reference to infectious diseases and toxins. However, due to the uneven distribution of the reports for each disease, traditional statistics-based text mining techniques, represented by term frequency-related algorithms, are not suitable. Thus, we conducted a study in three steps: (i) report filtering, (ii) keyword extraction from reports and finally (iii) word co-occurrence network analysis, to fill the gap between ProMED and its utilization. The keyword extraction was performed with the TextRank algorithm; keyword co-occurrence networks were then produced using the top keywords from each document, and multiple network centrality measures were computed to analyse the co-occurrence networks. We used two major outbreaks in recent years, Ebola in 2014 and Zika in 2015, as cases to illustrate and validate the process. We found that the extracted information structures are consistent with the World Health Organisation's description of the timeline and phases of the epidemics. Our research presents a pipeline that can extract and organize the information to characterize the evolution of epidemic outbreaks. It also highlights the potential for ProMED to be utilized in monitoring, evaluating and improving responses to outbreaks.
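
For readers curious how the last two steps of such a pipeline might fit together, here is a minimal sketch in Python. It is not the authors' implementation: the stopword list, the window size, the function names and the toy report text are all assumptions made for illustration, the TextRank step is a simplified version (PageRank over a within-window word co-occurrence graph, with no part-of-speech filtering), and only degree centrality is computed, whereas the article uses multiple centrality measures.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical, deliberately tiny stopword list (a crude stand-in for POS filtering).
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "are", "was",
             "were", "for", "on", "with", "has", "have", "been", "by", "at", "from"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def textrank_keywords(text, top_k=5, window=3, damping=0.85, iterations=50):
    """Simplified TextRank: PageRank on an undirected word co-occurrence graph."""
    words = tokenize(text)
    neighbours = defaultdict(set)
    # Link each word to the other words that follow it within the window.
    for i, w in enumerate(words):
        for v in words[i + 1:i + window]:
            if v != w:
                neighbours[w].add(v)
                neighbours[v].add(w)
    scores = {w: 1.0 for w in neighbours}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[v] / len(neighbours[v]) for v in neighbours[w])
            for w in neighbours
        }
    # Sort by score (descending), breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


def cooccurrence_network(reports, top_k=5):
    """Link keywords that share a report's top-k list; weight = number of reports."""
    edges = defaultdict(int)
    for report in reports:
        keywords = sorted(textrank_keywords(report, top_k=top_k))
        for a, b in combinations(keywords, 2):
            edges[(a, b)] += 1
    return dict(edges)


def degree_centrality(edges):
    """Fraction of the other keywords that each keyword is connected to."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    n = len(adjacency)
    if n < 2:
        return {}
    return {w: len(nb) / (n - 1) for w, nb in adjacency.items()}


if __name__ == "__main__":
    # Toy text, not a real ProMED report.
    reports = ["Ebola outbreak: Ebola cases, Ebola deaths and the Ebola response."]
    network = cooccurrence_network(reports, top_k=3)
    print(degree_centrality(network))
```

In a setting like the one the abstract describes, the reports would first be filtered by disease and time period, and the network would be rebuilt for each period so that changes in the most central keywords can be compared against the known timeline of an outbreak.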
