Layman’s abstract for paper on conjectures on optimal nested generalized group testing algorithm

Each week, we will be publishing layman’s abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.

The article featured today is from Applied Stochastic Models in Business and Industry, with the full article now available to read in Early View here.

Malinovsky, Y. Conjectures on optimal nested generalized group testing algorithm. Appl Stochastic Models Bus Ind. 2020; 1– 8.

In the last few months of the COVID-19 pandemic, the mostly forgotten practice of group testing has been raised again in many countries as an efficient method for addressing an epidemic while facing restrictions of time and resources. During this very short period, numerous publications and reports have appeared in both scientific and non-scientific journals. The curious reader can easily see them, for example, in the Washington Post, NY Times, Scientific American, medRxiv, bioRxiv, ArXiv, and elsewhere.

The story goes back to 1943, when Robert Dorfman proposed a simple but very efficient strategy to administer syphilis testing (a blood test called the Wassermann test) to the millions of individuals drafted into the US army. Dorfman’s strategy can be demonstrated as follows: suppose that in a given population of size N = 999,999, any one randomly-chosen person has a probability p = 1/100 of having a disease, and the people are stochastically independent. Partition the population into 90,909 groups of size 11. In each group, 11 blood samples are pooled and tested together as one sample. If the test outcome is negative, then all group members are declared free from disease; therefore, instead of 11 individual tests, only one test is performed. Otherwise, if the test outcome is positive, then all members of the group are tested one by one; therefore, the total number of tests is 12. Direct calculations show that the expected number of tests here equals 195,571, versus 999,999 individual tests, which is an impressive saving of 80%.

It is possible to show that if p is below roughly 0.306, then Dorfman’s two-stage procedure is preferred over individual testing. In the 1960s and ’70s, Dorfman’s algorithm was improved in numerous publications, with particularly substantial contributions from Milton Sobel and Frank Hwang and their co-authors, among others. A fundamental result obtained in 1960 by Peter Ungar says that if p is above roughly 0.382, then individual testing is optimal, while individual testing is not optimal below this cut-off. However, sixty years later, the best procedure for the case with p below Ungar’s cut-off and general N remains unknown.

What is the story in this article? The story is easy to demonstrate using the current COVID-19 situation. While testing a population that consists of people from different areas, it cannot be expected that those people all have the same chance of disease, i.e. this is a situation where the disease prevalence is heterogeneous. Finding an optimal group testing configuration in the heterogeneous population is an extremely challenging problem, even for a given procedure. The issue is that there is a need to determine not only group sizes but also the members of the groups, and the number of such possibilities (number of the partition of the population) is astronomical even for the small population size. In fact, the optimal partition is known only for the Dorfman two-stages procedure (Frank Hwang, 1975).

When prevalence is homogeneous and pretty high (roughly above 0.293), but still below Ungar’s cut-off, there is a simple and efficient method, which was shown as optimal among nested algorithms by Frank Hwang and Yi-Ching Yao in 1990. It is not clear if this efficient method will work under similar assumptions while also applying the practical and realistic restriction of heterogeneous disease prevalence. The current work attempts to answer this question and provide a short survey of state-of-art adaptive group testing with heterogeneous prevalence.

More Details