Each week, we will be publishing layman’s abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.
The article featured today is from Statistics in Medicine, with the full article now available to read here.
Learning genetic and environmental graphical models from family data. Statistics in Medicine. 2020; 39: 2403–2422. https://doi.org/10.1002/sim.8545
Many challenges in biomedical research rely on understanding the associations and causal relationships among complex traits and diseases, which are often influenced by multiple genes (polygenic effects) and environmental factors. Structure learning algorithms have been used extensively to recover, from observational data, the probabilistic graphical model expressing the conditional independence relations among the variables. Meanwhile, family-based studies have proven useful in elucidating the genetic and environmental factors affecting variables, since familial clustering may reveal genetic effects. In this work, the authors develop specific conditional independence tests for each of the genetic and residual components of family data, based on univariate polygenic linear mixed models. They then extend standard structure learning algorithms, including the IC/PC and RFCI algorithms, to recover from multivariate Gaussian family data the conditional independence structure and its decomposition into two components: one where associations cluster in families and hence may be explained by shared family genes, and another where associations are explained by residual (environmental) factors. The proposed methods are evaluated in simulation studies and applied to the benchmark dataset from the Genetic Analysis Workshop 13, which captures significant features of the Framingham Heart Study. The algorithms are openly available as the FamilyBasedPGMs R package on GitHub.
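For readers curious what a conditional independence test looks like in practice, here is a minimal sketch in Python of the classical partial-correlation (Fisher z) test that algorithms such as PC commonly use for Gaussian data from unrelated individuals. It is not the authors' family-based test, which relies on polygenic linear mixed models, and it is not taken from the FamilyBasedPGMs package. The function names and the toy correlation matrix are assumptions made for illustration only.

```python
import math


def partial_corr(corr, i, j, cond=()):
    """Partial correlation of variables i and j given the variables in cond.

    corr is a correlation matrix given as a list of lists. The recursion
    removes one conditioning variable at a time.
    """
    if not cond:
        return corr[i][j]
    k, rest = cond[0], tuple(cond[1:])
    r_ij = partial_corr(corr, i, j, rest)
    r_ik = partial_corr(corr, i, k, rest)
    r_jk = partial_corr(corr, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik**2) * (1 - r_jk**2))


def fisher_z_pvalue(corr, n, i, j, cond=()):
    """Two-sided p-value for H0: i and j are independent given cond.

    Assumes n independent Gaussian observations (no family structure).
    """
    r = partial_corr(corr, i, j, cond)
    r = min(max(r, -0.999999), 0.999999)  # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - len(cond) - 3) * abs(z)
    # 2 * (1 - Phi(stat)) written with the complementary error function
    return math.erfc(stat / math.sqrt(2))


# Hypothetical population correlations for a chain X -> Y -> Z with
# unit-variance noise at each step (variables indexed 0, 1, 2).
r_xy = 1 / math.sqrt(2)
r_xz = 1 / math.sqrt(3)
r_yz = 2 / math.sqrt(6)
corr = [
    [1.0, r_xy, r_xz],
    [r_xy, 1.0, r_yz],
    [r_xz, r_yz, 1.0],
]

p_marginal = fisher_z_pvalue(corr, 500, 0, 2)            # X vs Z, no conditioning
p_given_y = fisher_z_pvalue(corr, 500, 0, 2, cond=(1,))  # X vs Z given Y
```

In this toy chain, X and Z are strongly associated on their own, but the association vanishes once Y is conditioned on, so a structure learning algorithm would remove the X to Z edge. The article's contribution is to replace this kind of test with versions that remain valid when observations are correlated within families, and to run them separately on the genetic and environmental components.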