# Agent Based Models: Statistical Challenges and Opportunities


**Author:** Christopher Wikle | **Date:** 15 Sep 2014 | **Copyright:** Image appears courtesy of Wikipedia

Many branches of science are interested in discrete-valued spatio-temporal processes. Typically, such models are considered in discrete time but allow space to be continuous or discrete. In spatio-temporal statistics, it is most common to consider such models from a generalized linear mixed-model perspective (e.g., see Cressie and Wikle, 2011, Chapter 7). Such a “top-down” approach considers the spatio-temporal properties of the system in terms of a latent Gaussian spatio-temporal dynamical model. Alternatively, one may consider such processes as Markov random fields (MRFs) using one of the classes of spatio-temporal “auto” models (e.g., the spatio-temporal auto-logistic model described by Zhu et al. 2005). The MRF approach is a local specification in which relationships between neighbors are specified conditionally, but such models have not typically considered complex multi-process and nonlinear relationships. In general, it has proven quite difficult to efficiently parameterize realistic nonlinear, multi-process spatio-temporal dynamical processes from a statistical perspective (e.g., Wikle and Hooten, 2010), and there is a great need for such models, given that these are the types of processes we encounter in the real world.

Many scientists consider an alternative “bottom-up” modeling strategy for discrete-valued spatio-temporal dynamical processes that is agent-based. Such agent-based models (ABMs) are common in epidemiology and the social sciences (e.g., Filatova et al. 2013; Gilbert 2008; Keeling and Rohani 2008; Sattenspiel 2009), and are usually termed individual-based models in the ecological sciences (e.g., Grimm and Railsback, 2005), multi-agent models in engineering (e.g., Olfati-Saber, 2006), and sometimes cellular automata in the physical and mathematical sciences (e.g., Wolfram, 1984). The common link between these paradigms is that the system is characterized by autonomous agents (or individuals) that can take a discrete number of possible states and vary with time and space depending on a set of deterministic or probabilistic “rules.”

Because they are formulated from a bottom-up perspective, ABMs have the property that the characteristics or actions of the individual agents ultimately define the properties and behavior of their system. The critical component of such a modeling approach is that the autonomous agents interact with each other and with their environment. Much of the literature associated with these models assumes that the agent interactions and evolution are governed by deterministic rules. From a statistical modeling perspective, the agent interactions are stochastic, and their probabilistic specification is one of the important modeling challenges. That is, in probabilistic ABMs, the evolution of an agent's state is defined through a parametric probability distribution, and thus the evolution and behavior of the system as a whole may not be unique (although the properties of the system usually are).
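To make the idea of a probabilistic rule concrete, here is a minimal sketch of a binary-state ABM on a ring of agents. Everything specific in it is an assumption made for illustration and does not come from the cited papers: the ring neighborhood, the logistic form of the rule, and the names `step`, `simulate`, `beta0`, and `beta1`. Each agent's next state is a Bernoulli draw whose probability depends on how many of its two neighbors are currently in state 1.

```python
import math
import random


def step(states, beta0, beta1, rng):
    """One synchronous update of a binary ABM on a ring (illustrative rule).

    An agent's next state is Bernoulli(p), with
    logit(p) = beta0 + beta1 * (number of neighbours currently in state 1).
    """
    n = len(states)
    new_states = []
    for i in range(n):
        neighbours = states[(i - 1) % n] + states[(i + 1) % n]
        p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * neighbours)))
        new_states.append(1 if rng.random() < p else 0)
    return new_states


def simulate(n_agents, n_steps, beta0, beta1, seed):
    """Run the ABM from a single agent in state 1; return all visited states."""
    rng = random.Random(seed)
    states = [0] * n_agents
    states[n_agents // 2] = 1
    history = [states]
    for _ in range(n_steps):
        states = step(states, beta0, beta1, rng)
        history.append(states)
    return history


# Same parameters, different seeds: the realizations generally differ,
# which is the sense in which the system evolution "may not be unique".
run_a = simulate(n_agents=20, n_steps=10, beta0=-2.0, beta1=2.0, seed=1)
run_b = simulate(n_agents=20, n_steps=10, beta0=-2.0, beta1=2.0, seed=2)
print(sum(run_a[-1]), sum(run_b[-1]))
```

Fixing the seed reproduces a run exactly, while changing it gives another realization from the same parametric rule; in a statistical treatment, `beta0` and `beta1` would be the quantities to learn from data.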


One distinction between agent-based modeling in statistics and in other disciplines is that statisticians define “agents” very broadly so that they can represent various scales, e.g., a virus, an individual animal, a node in a network, or a spatial unit (e.g., census block, state, etc.). Thus, in this case, many of the disciplinary distinctions between the various ABMs (individual-based models, multi-agent models, cellular automata) are lost. As mentioned previously, ABMs are usually considered from a discrete-time perspective. In this case, one must decide whether all agents are updated at once during each simulation time step (i.e., synchronous updating) or one at a time, so that as soon as an agent is updated its new state is available to the other agents (i.e., asynchronous updating). This choice can make a difference in the global system properties (e.g., Caron-Lormier et al. 2008), but it is typically more computationally feasible to utilize synchronous updating.
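The difference between the two updating schemes can be seen in a small sketch. The rule used here (each agent on a ring copies the current state of its left neighbor) is a deliberately simple, deterministic assumption chosen for illustration, and the function names and the `order` argument are hypothetical.

```python
def synchronous_step(states):
    """All agents read the OLD states, then all are replaced at once."""
    n = len(states)
    return [states[(i - 1) % n] for i in range(n)]


def asynchronous_step(states, order=None):
    """Agents update one at a time; each new state is visible immediately.

    `order` is the (hypothetical) sequence in which agents are visited;
    it defaults to 0, 1, ..., n-1. In practice it is often randomized.
    """
    states = list(states)  # work on a copy so the input is not modified
    n = len(states)
    for i in (order if order is not None else range(n)):
        states[i] = states[(i - 1) % n]
    return states


start = [1, 0, 0, 0]
print(synchronous_step(start))   # [0, 1, 0, 0]: the pulse moves one agent to the right
print(asynchronous_step(start))  # [0, 0, 0, 0]: the pulse is overwritten before it is copied
```

With the same rule and the same starting configuration, the two schemes produce different global behavior: the synchronous system preserves the pulse, while this particular asynchronous visiting order eliminates it.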

Perhaps the primary scientific challenge associated with probabilistic ABMs is the specification of the parametric probability distributions that govern agent behavior. Clearly, statistical modelers play a role in this scientific challenge, but they must also consider the associated parameter (or rule) learning and, critically, must account realistically for uncertainty in observations and parameters. Often the local and global properties of the system are quite sensitive to parameter variability as a consequence of nonlinear relationships, sensitivity to initial conditions (i.e., chaos), and parameter redundancy. Statisticians can also play a significant role in developing appropriate measures for model selection in the ABM context. Given the individual nature of these models and the fact that there are often a very large number of autonomous agents, one must be cognizant of computational complexity when addressing these issues. Of course, it should be noted that computation is also often challenging in purely deterministic ABM implementations, especially if one is trying to obtain the rules of agent behavior empirically.

In recent years, it has been demonstrated that the principles of hierarchical statistical modeling can mitigate many of the aforementioned issues related to probabilistic ABMs (e.g., Hooten et al. 2010; Hooten and Wikle 2010). As summarized in Cressie and Wikle (2011), the hierarchical framework for spatio-temporal systems is ideal for accommodating the uncertainty associated with observations, process, and parameters. Furthermore, a critical yet underutilized advantage of hierarchical models in ABMs is that parameters describing individual interactions with other agents and the environment can themselves be considered as stochastic processes with spatial and/or temporal variability (and may include exogenous covariate information about the environment).
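The three-level structure referred to above can be written compactly using bracket notation for probability distributions. In this sketch, $Z$ denotes the observations, $Y$ the latent agent states, and $\theta_D$ and $\theta_P$ the data-model and process-model parameters. The second display is only an illustrative assumption of how an agent-level rule and a spatially varying interaction parameter might enter; the neighborhood $\mathcal{N}_i$, covariates $\mathbf{x}_i$, and random effect $\eta_i$ are hypothetical symbols, not a model taken from the cited papers.

```latex
\begin{align*}
\text{Data model:}      \quad & [\,Z \mid Y, \theta_D\,] \\
\text{Process model:}   \quad & [\,Y \mid \theta_P\,] \\
\text{Parameter model:} \quad & [\,\theta_D, \theta_P\,] \\
\text{Posterior:}       \quad & [\,Y, \theta_D, \theta_P \mid Z\,] \;\propto\;
    [\,Z \mid Y, \theta_D\,]\,[\,Y \mid \theta_P\,]\,[\,\theta_D, \theta_P\,]
\end{align*}

% Illustrative (assumed) agent-level process with a spatially varying parameter:
\begin{align*}
Y_t(i) \mid \mathbf{Y}_{t-1}, \theta_P &\sim \mathrm{Bernoulli}\bigl(p_t(i)\bigr), \\
\operatorname{logit} p_t(i) &= \theta_0 + \theta_1(i) \sum_{j \in \mathcal{N}_i} Y_{t-1}(j), \\
\theta_1(i) &= \mathbf{x}_i^{\top}\boldsymbol{\beta} + \eta_i, \qquad
\boldsymbol{\eta} \sim \mathrm{Gau}(\mathbf{0}, \boldsymbol{\Sigma}_\eta).
\end{align*}
```

Here the interaction strength $\theta_1(i)$ is itself modeled as a process with covariates and spatially correlated error, which is one way to read the statement that interaction parameters can be treated as stochastic processes.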

In summary, rule-based ABMs can be an effective modeling paradigm to account for complex spatio-temporal dynamics for discrete state spaces. With the addition of stochastic terms, and the presence of data, this becomes a statistical modeling challenge. The hierarchical modeling approach can facilitate such statistical modeling because of its strength in dealing with uncertainty in data, process, and parameters. In addition, these models may be considered from a network perspective in some cases, which also allows one to borrow many existing network modeling methodologies. Although the use of complicated network models in the ABM context is relatively new, we expect to see additional developments in this area. There are also potentially important issues associated with model selection in ABMs (e.g., Piou et al., 2009), and this is likely to be an important area of future research. Finally, approximating the simulation model through statistical surrogates (e.g., Currin et al. 1991; Dancik et al. 2010; Hooten et al. 2011; Parry et al. 2013), simplification of the statistical likelihood, and/or approximate Bayesian computation (e.g., Sottoriva and Tavaré, 2010; Wood 2010) will likely be important components of future statistical implementations of ABMs.
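As a rough illustration of how approximate Bayesian computation sidesteps the likelihood, the sketch below applies plain rejection ABC to a toy infection ABM on a ring. It is not the method of any of the papers cited above; the simulator, the Uniform(0, 1) prior, the summary statistic (final number of infected agents), and all function and parameter names are assumptions made for the example.

```python
import random


def simulate_infected_count(beta, n_agents, n_steps, rng):
    """Toy ring ABM: each infected neighbour independently infects a
    susceptible agent with probability beta; infection is permanent.
    Returns the number of infected agents after n_steps."""
    infected = [False] * n_agents
    infected[0] = True
    for _ in range(n_steps):
        new = list(infected)  # synchronous update: read old, write new
        for i in range(n_agents):
            if not infected[i]:
                k = infected[(i - 1) % n_agents] + infected[(i + 1) % n_agents]
                if rng.random() < 1.0 - (1.0 - beta) ** k:
                    new[i] = True
        infected = new
    return sum(infected)


def abc_rejection(observed, n_draws, tol, n_agents, n_steps, seed=0):
    """Rejection ABC: keep prior draws whose simulated summary statistic
    lies within `tol` of the observed one."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        beta = rng.random()  # draw from the assumed Uniform(0, 1) prior
        simulated = simulate_infected_count(beta, n_agents, n_steps, rng)
        if abs(simulated - observed) <= tol:
            accepted.append(beta)
    return accepted


# Hypothetical "observed" summary: 12 of 50 agents infected after 10 steps.
sample = abc_rejection(observed=12, n_draws=2000, tol=1, n_agents=50, n_steps=10)
if sample:
    print(len(sample), sum(sample) / len(sample))
```

The accepted values form an approximate posterior sample for `beta`. The tolerance and the choice of summary statistic control the quality of the approximation, and each draw costs a full simulation, which is why surrogates and likelihood simplifications are mentioned alongside ABC.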

**References**

Cressie, N. and Wikle, C. K. (2011). *Statistics for Spatio-Temporal Data*. John Wiley & Sons, Hoboken, NJ.

Caron-Lormier, G., Humphry, R. W., Bohan, D. A., Hawes, C., and Thorbek, P. (2008). Asynchronous and synchronous updating in individual-based models. *Ecological Modelling*, 212(3):522–527.

Currin, C., Mitchell, T., Morris, M., and Ylvisaker, D. (1991). Bayesian prediction of deterministic functions, with applications to the design and analysis of computer experiments. *Journal of the American Statistical Association*, 86(416):953–963.

Dancik, G. M., Jones, D. E., and Dorman, K. S. (2010). Parameter estimation and sensitivity analysis in an agent-based model of *Leishmania major* infection. *Journal of Theoretical Biology*, 262(3):398–412.

Filatova, T., Verburg, P. H., Parker, D. C., and Stannard, C. A. (2013). Spatial agent-based models for socio-ecological systems: Challenges and prospects. *Environmental Modelling and Software*, 45:1–7.

Gilbert, N. (2008). *Agent-Based Models*. Number 153. Sage, London.

Grimm, V. and Railsback, S. F. (2005). *Individual-Based Modeling and Ecology*. Princeton University Press, Princeton, NJ.

Hooten, M. B., Johnson, D. S., Hanks, E. M., and Lowry, J. H. (2010). Agent-based inference for animal movement and selection. *Journal of Agricultural, Biological, and Environmental Statistics*, 15(4):523–538.

Hooten, M. B., Leeds, W. B., Fiechter, J., and Wikle, C. K. (2011). Assessing first-order emulator inference for physical parameters in nonlinear mechanistic models. *Journal of Agricultural, Biological, and Environmental Statistics*, 16(4):475–494.

Hooten, M. B. and Wikle, C. K. (2010). Statistical agent-based models for discrete spatio-temporal systems. *Journal of the American Statistical Association*, 105(489):236–248.

Keeling, M. J. and Rohani, P. (2008). *Modeling Infectious Diseases in Humans and Animals*. Princeton University Press, Princeton, NJ.

Olfati-Saber, R. (2006). Flocking for multi-agent dynamic systems: Algorithms and theory. *IEEE Transactions on Automatic Control*, 51(3):401–420.

Parry, H. R., Topping, C. J., Kennedy, M. C., Boatman, N. D., and Murray, A. W. (2013). A Bayesian sensitivity analysis applied to an agent-based model of bird population response to landscape change. *Environmental Modelling and Software*, 45:104–115.

Piou, C., Berger, U., and Grimm, V. (2009). Proposing an information criterion for individual-based models developed in a pattern-oriented modelling framework. *Ecological Modelling*, 220(17):1957–1967.

Sattenspiel, L. (2009). *The Geographic Spread of Infectious Diseases: Models and Applications*. Princeton University Press, Princeton, NJ.

Sottoriva, A. and Tavaré, S. (2010). Integrating approximate Bayesian computation with complex agent-based models for cancer research. In *Proceedings of COMPSTAT'2010*, pp. 57–66. Physica-Verlag HD.

Wikle, C. K. and Hooten, M. B. (2010). A general science-based framework for dynamical spatio-temporal models. *Test*, 19(3):417–451.

Wolfram, S. (1984). Cellular automata as models for complexity. *Nature*, 311(5985):419–424.

Wood, S. (2010). Statistical inference for noisy nonlinear ecological dynamic systems. *Nature*, 466:1102–1104.

Zhu, J., Huang, H.-C., and Wu, J. (2005). Modeling spatial-temporal binary data using Markov random fields. *Journal of Agricultural, Biological, and Environmental Statistics*, 10(2):212–225.
