Each week, we publish lay abstracts of new articles from our prestigious portfolio of journals in statistics. The aim is to highlight the latest research to a broader audience in an accessible format.

The article featured today is from the Canadian Journal of Statistics with the full article now available to read here.

Sidrow, E., Heckman, N., Fortune, S.M.E., Trites, A.W., Murphy, I. and Auger-Méthé, M. (2022), Modelling multi-scale, state-switching functional data with hidden Markov models. Can J Statistics, 50: 327-356. https://doi.org/10.1002/cjs.11673

Many statistical inference techniques rely on the assumption that the observations in a data set are independent of one another. However, many data sets violate this assumption in practice, and some sequences of observations clearly display a dependence structure. Hidden Markov models, or HMMs, are a common statistical technique used to model processes where some hidden state of interest changes over time and dictates the distribution of an associated observation. For example, in speech recognition the hidden state may be the word someone is pronouncing, and the observations may be sound waves recorded from a microphone. Alternatively, in ecology the unobserved state may be the behavior of an animal (e.g. resting, traveling, hunting), and the observations may be a sequence of GPS coordinates over time.
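To make the idea concrete, here is a minimal Python sketch of a hidden Markov model with two hidden states. It is an illustration only and is not taken from the article: the function name simulate_hmm, the transition probabilities, the normal observation distributions and the uniformly drawn starting state are all assumptions chosen for the example.

```python
import random


def simulate_hmm(n_steps, transition, means, sds, seed=0):
    """Simulate hidden states and observations from a simple HMM.

    transition[i][j] is the probability of moving from state i to state j.
    Given the hidden state s, the observation is drawn from a normal
    distribution with mean means[s] and standard deviation sds[s].
    """
    rng = random.Random(seed)
    n_states = len(transition)
    # Assumption for this sketch: the first state is drawn uniformly.
    state = rng.randrange(n_states)
    states, observations = [], []
    for _ in range(n_steps):
        states.append(state)
        # The observation depends only on the current hidden state.
        observations.append(rng.gauss(means[state], sds[state]))
        # The next state depends only on the current state.
        state = rng.choices(range(n_states), weights=transition[state])[0]
    return states, observations


# Hypothetical example: state 0 is "resting" and state 1 is "traveling".
transition = [[0.9, 0.1], [0.2, 0.8]]
states, observations = simulate_hmm(
    200, transition, means=[0.0, 5.0], sds=[1.0, 1.0]
)
```

In practice only the observations would be recorded, and the statistical task is to recover the most plausible sequence of hidden states from them.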

Observations of a hidden Markov model are often assumed to be either scalars or vectors with some simple distribution. However, as data become increasingly finely sampled, these observations are often functions or high-frequency time series. In speech recognition the observation may be an entire sound wave corresponding to a particular word or syllable, and in ecology it may be the depth profile corresponding to a particular dive of a marine animal. These complicated observations are difficult to account for using traditional hidden Markov models.

One way to construct hidden Markov models when the observations are functions is to use a fine-scale model to account for the within-function variation and dependencies. The parameters of the within-function (or fine-scale) model are defined by the original between-function (or coarse-scale) HMM. If the fine-scale model is also an HMM, the resulting model is referred to as a hierarchical hidden Markov model.

However, issues can arise when using hidden Markov models to describe fine-scale behaviour in high-frequency data sets. Namely, hidden Markov models assume that observations are completely independent of one another if the hidden states are known, but this is usually not the case for high-frequency data. For example, even if an animal is known to be hunting, its position at a given time is likely correlated with its position one second ago.

The authors employ a two-stage approach to correct fine-scale hidden Markov models for high-frequency data. First, they utilize appropriate moving-window transformations, such as moving averages and short-time Fourier transforms, to summarize complicated fine-scale behaviours within short windows of time. Next, they explicitly model correlation between windows using an autoregressive process.
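The two stages can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions and not the authors' implementation: it uses non-overlapping moving averages only (no short-time Fourier transform), a first-order autoregressive model fitted by least squares, and a made-up signal. The names moving_average and fit_ar1 are hypothetical.

```python
import random


def moving_average(series, window):
    """Stage 1: average each non-overlapping block of `window` values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(0, len(series) - window + 1, window)
    ]


def fit_ar1(summaries):
    """Stage 2: least-squares fit of x[t] - mu = phi * (x[t-1] - mu) + noise.

    Returns the sample mean mu and the estimated autoregressive
    coefficient phi.
    """
    mu = sum(summaries) / len(summaries)
    centred = [x - mu for x in summaries]
    numerator = sum(a * b for a, b in zip(centred[1:], centred[:-1]))
    denominator = sum(a * a for a in centred[:-1])
    return mu, numerator / denominator


# Hypothetical high-frequency signal in which each sample is strongly
# correlated with the previous one.
rng = random.Random(1)
signal = [0.0]
for _ in range(4999):
    signal.append(0.99 * signal[-1] + rng.gauss(0.0, 1.0))

window_means = moving_average(signal, window=10)  # 500 window summaries
mu, phi = fit_ar1(window_means)
```

A value of phi well above zero indicates that consecutive window summaries remain correlated, which is the dependence that a standard hidden Markov model ignores and that the autoregressive term is meant to capture.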

As a case study, they model the fine-scale kinematic movement of a northern resident killer whale (Orcinus orca) off the western coast of Canada. Through simulations, they show that their model produces results that are more interpretable and more accurate than those of existing methods.
