Every few days, we will be publishing layman’s abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.
The Open Access article featured today is from Environmetrics: ‘Nonparametric statistical downscaling for the fusion of data of different spatiotemporal support’ by C. J. Wilkie, C. A. Miller, E. M. Scott, R. A. O’Donnell, P. D. Hunter, E. Spyrakos and A. N. Tyler, published in issue 30.3.
Wilkie CJ, Miller CA, Scott EM, et al. Nonparametric statistical downscaling for the fusion of data of different spatiotemporal support. Environmetrics. 2019; 30:e2549. https://doi.org/10.1002/env.2549
The full article is available to read online here.
How can satellite data be fused with lake samples to investigate lake water quality?
Lakes are vitally important to the health of planet Earth. They are unique ecosystems, directly sustaining freshwater plant and animal species and indirectly sustaining human and animal populations. A comprehensive understanding of how lake water quality changes over space and time is therefore essential. Traditionally, however, in-lake monitoring data (of chlorophyll-a, for example) are collected infrequently and at only a few locations within each lake. Satellites, by contrast, observe the light reflected from the lake surface, and algorithms can convert this reflectance to chlorophyll-a concentrations. This provides an additional data resource that covers the whole lake and observes it frequently.
An important question is how the in-lake and satellite data sources can be combined. The in-lake data, assumed to be accurate to within measurement error, are collected at point locations and at irregular points in time, while the remote sensing data are available on a spatial grid at potentially different points in time. In the paper, the authors propose a novel nonparametric statistical downscaling model to combine the two sources. This is a regression model that handles the different temporal frequencies of the two datasets by treating each time series as observations of a smooth curve over time, and handles potential spatial differences between the datasets by allowing the regression coefficients to vary smoothly over space. The method allows investigators to produce calibrated spatial maps (with associated uncertainties) of the variable of interest across each lake, incorporating the spatial information from the remote sensing data while ensuring accuracy through fusion with the in-lake data. A time series of predictions (with associated uncertainties) can also be produced at any location within a lake. By integrating these disparate data sources, the approach supports spatial and temporal prediction of water quality, with associated uncertainties, to inform future water quality management. The method is potentially applicable to a wide range of datasets beyond lake data, wherever data of different spatial and temporal support are to be fused.
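For readers who like to see an idea in code, the two ingredients described above (smooth curves over time, and regression coefficients that vary over space) can be illustrated with a small Python sketch. This is not the authors' model: it uses simulated data, a simple Fourier basis for the curves, coefficients that vary linearly with location, and ordinary least squares, and it produces no uncertainty estimates. All function names, site coordinates and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def fourier_basis(t, n_harmonics=1):
    """Constant column plus sine/cosine pairs with period 1 (assumed basis)."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols += [np.sin(2 * np.pi * k * t), np.cos(2 * np.pi * k * t)]
    return np.column_stack(cols)

def smooth_curve(t_obs, y_obs, t_grid, n_harmonics=1):
    """Fit a smooth curve by least squares and evaluate it on a common grid."""
    coef, *_ = np.linalg.lstsq(fourier_basis(t_obs, n_harmonics), y_obs, rcond=None)
    return fourier_basis(t_grid, n_harmonics) @ coef

def design(site, x_curve):
    """Regression design: intercept and slope both vary linearly with location."""
    sx, sy = site
    one = np.ones_like(x_curve)
    return np.column_stack([one, sx * one, sy * one,
                            x_curve, sx * x_curve, sy * x_curve])

def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    t_grid = np.linspace(0.0, 1.0, 20)   # common time grid for both sources
    t_sat = np.linspace(0.0, 1.0, 40)    # frequent, regular satellite passes

    # Simulated "truth" (purely illustrative, not real lake data).
    def sat_truth(site, t):
        return 2.0 + np.sin(2 * np.pi * t) + 0.5 * site[0]

    def lake_truth(site, t):
        alpha = 1.0 + 0.5 * site[0]      # spatially varying intercept
        beta = 1.5 - 0.5 * site[1]       # spatially varying slope
        return alpha + beta * sat_truth(site, t)

    sites = rng.uniform(0, 1, size=(8, 2))  # a few in-lake sampling locations
    X_rows, y_rows = [], []
    for site in sites:
        t_lake = np.sort(rng.uniform(0, 1, size=12))  # sparse, irregular times
        lake_obs = lake_truth(site, t_lake) + rng.normal(0, 0.05, size=12)
        sat_obs = sat_truth(site, t_sat) + rng.normal(0, 0.2, size=40)
        # Step 1: represent each time series as a smooth curve on the same grid.
        x_curve = smooth_curve(t_sat, sat_obs, t_grid)
        y_curve = smooth_curve(t_lake, lake_obs, t_grid)
        X_rows.append(design(site, x_curve))
        y_rows.append(y_curve)

    # Step 2: regress in-lake curves on satellite curves across all sites.
    coef, *_ = np.linalg.lstsq(np.vstack(X_rows), np.concatenate(y_rows), rcond=None)

    # Step 3: calibrate the satellite curve at a location with no in-lake data.
    new_site = np.array([0.5, 0.5])
    sat_new = sat_truth(new_site, t_sat) + rng.normal(0, 0.2, size=40)
    x_new = smooth_curve(t_sat, sat_new, t_grid)
    y_pred = design(new_site, x_new) @ coef

    truth = lake_truth(new_site, t_grid)
    rmse = lambda a: float(np.sqrt(np.mean((a - truth) ** 2)))
    return {"coef": coef, "y_pred": y_pred,
            "rmse_calibrated": rmse(y_pred), "rmse_raw": rmse(x_new)}

result = run_demo()
print("RMSE of raw satellite curve:  ", result["rmse_raw"])
print("RMSE of calibrated prediction:", result["rmse_calibrated"])
```

In this toy setting the calibrated curve at the unsampled location should sit much closer to the simulated in-lake truth than the raw satellite curve does, which is the basic motivation for fusing the two sources. A fuller treatment along the lines described in the article would also quantify the uncertainty of each prediction and let the coefficients vary smoothly, rather than linearly, over space.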
Image copyright: Patrick Rhodes