Each week, we will be publishing layman’s abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.
The article featured today is from the Canadian Journal of Statistics, with the full article now available to read on Early View here.
Li, C., Hung, Y. and Xie, M. (2020), A sequential split‐and‐conquer approach for the analysis of big dependent data in computer experiments. Can J Statistics. doi:10.1002/cjs.11559
Massive correlated data with many inputs are often generated from computer simulations to study complex systems. The Gaussian process (GP) model is a widely used tool for the analysis of computer simulations. Although GPs provide a simple and effective approximation, two critical issues remain unresolved. One is the computational burden of estimation and prediction: with a large sample size and a large number of input variables, these tasks are often unstable or infeasible. The other is how to improve the commonly used predictive distribution, which is known to underestimate the uncertainty. In this article, a unified framework is introduced that tackles both issues simultaneously. It consists of a sequential split-and-conquer procedure, an information-combining technique using confidence distributions (CDs), and a frequentist predictive distribution based on the combined CD. It is shown that, under mild conditions, the proposed method maintains the same asymptotic efficiency as conventional likelihood inference while dramatically reducing the computation in both estimation and prediction. The proposed method is demonstrated on a real data example based on tens of thousands of computer experiments generated from a computational fluid dynamics simulator.
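For readers who want a feel for the split-and-combine idea, the following is a minimal sketch in Python and not the authors' procedure. It assumes independent blocks and a simple mean parameter, whereas the article's sequential approach is designed for dependent GP data. Each block yields an estimate and its variance, treated here as a normal confidence distribution, and the blocks are combined by inverse-variance weighting. All function names are hypothetical and chosen for illustration.

```python
import numpy as np

def block_estimate(y):
    """Estimate the mean of one block and the variance of that estimate."""
    n = len(y)
    return y.mean(), y.var(ddof=1) / n

def combine_normal_cds(estimates, variances):
    """Combine normal confidence distributions N(est_k, var_k).

    Assumption: blocks are independent, so inverse-variance weighting
    applies. Returns the combined estimate and its variance.
    """
    w = 1.0 / np.asarray(variances, dtype=float)
    est = np.sum(w * np.asarray(estimates, dtype=float)) / np.sum(w)
    return est, 1.0 / np.sum(w)

def split_and_conquer_mean(y, n_blocks):
    """Split the data into blocks, estimate per block, then combine."""
    blocks = np.array_split(np.asarray(y, dtype=float), n_blocks)
    ests, variances = zip(*(block_estimate(b) for b in blocks))
    return combine_normal_cds(ests, variances)

# Simulated data (illustrative only): 10,000 draws with true mean 2.0.
rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.0, size=10_000)
est, var = split_and_conquer_mean(y, n_blocks=10)
print(f"combined estimate: {est:.3f}, standard error: {np.sqrt(var):.4f}")
```

Each block only needs its own data in memory, which is where the computational saving comes from. In a GP setting the saving is larger, because the cost of working with a covariance matrix grows much faster than linearly in the sample size.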