Probability and Statistics for Ecosystem Managers: An interview with author Dr. Timothy Haas

This summer, Wiley was proud to publish Introduction to Probability and Statistics for Ecosystem Managers: Simulation and Resampling by Dr. Timothy Haas, Associate Professor of Business Statistics at the University of Wisconsin-Milwaukee, which explores computer-intensive probability and statistics for ecosystem management decision making.

Simulation is an accessible way to explain probability and stochastic model behaviour to beginners. This book introduces probability and statistics to future and practicing ecosystem managers by providing a comprehensive treatment of these two areas. The author presents a self-contained introduction for individuals involved in monitoring, assessing, and managing ecosystems and features intuitive, simulation-based explanations of probabilistic and statistical concepts. Mathematical programming details are provided for estimating ecosystem model parameters with Minimum Distance, a robust and computer-intensive method.

The majority of examples illustrate how probability and statistics can be applied to ecosystem management challenges. There are over 50 exercises – making this book suitable for a lecture course in a natural resource and/or wildlife management department, or as the main text in a program of self-study.

Key features:

• Reviews different approaches to wildlife and ecosystem management and inference.
• Uses simulation as an accessible way to explain probability and stochastic model behaviour to beginners.
• Covers material from basic probability through to hierarchical Bayesian models and spatial/spatio-temporal statistical inference.
• Provides detailed instructions for using R, along with complete R programs to recreate the output of the many examples presented.
• Provides an introduction to Geographic Information Systems (GIS) along with examples from Quantum GIS, a free GIS software package.
• Includes a companion website featuring all R code and data used throughout the book.
• Solutions to all exercises are presented along with an online intelligent tutoring system that supports readers who are using the book for self-study.

Dr. Haas specializes in multivariate statistics, spatial statistics, decision-making using probabilistic networks, computer intensive statistical methods, time series, and ecosystem management. StatisticsViews talks to Dr. Haas about his new book and his career.

1. Congratulations on the publication of Introduction to Probability and Statistics for Ecosystem Managers: Simulation and Resampling which ‘introduces probability and statistics to future and practicing ecosystem managers by providing a comprehensive treatment of these two areas. The author presents a self-contained introduction for individuals involved in monitoring, assessing, and managing ecosystems and features intuitive, simulation-based explanations of probabilistic and statistical concepts. Mathematical programming details are provided for estimating ecosystem model parameters with Minimum Distance, a robust and computer-intensive method.’ How did the writing process begin? What was it that initiated the project?

I had lunch with Heather Kay, Assistant Editor, Statistics and Mathematics, Wiley in Oxford in May of 2011 as I was giving a talk at the Department of Geography at Oxford University. She asked me if there were any additional book topics I might be interested in pursuing. I replied that an introductory book might be challenging.

2. There are over 50 exercises within the book – making it suitable for a lecture course in a natural resource and/or wildlife management department, or as the main text in a program of self-study. Was the inclusion of the exercises one of your main objectives during the writing process? What did you set out to achieve in reaching your readers?

I wanted a reader of my book to be able to learn the material independently. Working exercises is the best way to do this. The big thrust of the book was that individuals who are already employed would be able to form a self-education programme even though they could not get to a university and take courses. This way they could learn enough probability and statistics to handle the increasingly complicated ecosystem models that people are recommending be used.

There is real pressure on people who are charged with wildlife protection or pollution or climate change modelling to improve their understanding of probability models, but they are in a situation where they cannot go back to university. I have learnt over many years that if you do exercises, you start to learn the material, as opposed to just reading a book passively.

There are also online tutorials which walk you through a particular problem and give you suggestions along the way when you stray. When you learn probability and statistics, it is not a linear activity – you hit roadblocks and you need someone’s help so that you can move forward, and that process generally requires a tutor. It is very difficult to study this material alone because when you come across something that you do not understand, you need someone to explain it to you so that you can move on to the next material. People who are working in agencies and have a lot of responsibility are being increasingly swamped by models that they do not understand, so this book was a small effort to try and address that. I looked into the online intelligent tutoring literature and I have tried my hand at developing some modules that help.

The exercises are the bedrock of the book – if you try the exercises and look at my solutions and then try them again, you will slowly start to acquire an understanding and a skill all on your own.

3. You also provide detailed instructions for using R, along with complete R programs to recreate the output of the many examples presented. What/who was it that introduced you to R in the first place and how important is its relevance within the book?

I’ve worked in S, S-PLUS, and R for the past 24 years, after being introduced to them by the S-PLUS team at the Department of Statistics, University of Washington, while I was a visiting assistant professor there during the 1989-90 academic year.

Since R is free, it is an accessible choice for cash-strapped ecosystem management agencies. As it has extensive capabilities and is programmable, it can perform most analyses required by these agencies.

I was actually going to use a program called SimFit, and Wiley recommended that I use R instead. I was trying to find something that was easy to pick up early on. There is an issue with R in that it has a steep learning curve at the beginning, so I wrote about R Commander in the book, but then I quickly move into doing everything script-based, which is how 95% of the world uses R. There is a challenge to a reader early on in the book where they have to move from a menu-based interface to a scripting language. For people who have never done this, it is a challenge.

As R is so widespread, you really need to use it in an introductory text, unless you are tied to a commercial package like SAS or SPSS. I am trying to work with people who are in agencies that have very modest budgets. R does start to falter when you ask it to do very heavy computation and because of this, I have also written many Java programs that are very fast. I also take advantage of some optimization additions to R and explain in the book that you need to add these few lines to make R more efficient. At the end of the day, R has limitations when you try to do large-memory or large-processing activities, which is unfortunately the future of statistics. The R community is concerned about this and is struggling with this challenge. The cheap way around it is to use a lower-level language like Java or Fortran, but the problem with that is that you have now lost a large part of your user base. So many people struggle to write R programs, and they are really overwhelmed when trying to write in a lower-level language as well.

4. Were there areas that you found more challenging and if so, why?

A central goal of my book is to show how to build a model of a political-ecological system, and then use it to develop feasible and sustainable ecosystem management plans. In an introductory book, such a presentation needs to be accessible to readers with only a modest knowledge of probability and statistics. I hope my example of rhino poaching on private reserves in South Africa gets close to achieving this goal.

There is something of a message in my introductory book and it hails back to my first book in that ecosystem managers must start to consider the political aspects of their jobs. They just cannot hide their heads in the sand and say that this is just an ecological activity. People are impacting ecosystems and you have to understand what these people want and what actions they are capable of taking.

Traditional wildlife managers and environment protection administrators have largely paid lip service to the interests that different stakeholders have in ecosystems. My two books are an effort to build formal models of how individuals who are in and around the ecosystem behave, and to assess what their goals are. Such a goal is expressed in the introductory matter of the private reserves/rhino poaching example of the capstone chapter. Although it is a bit of a stretch, I hope that it is accessible to those who are just starting in probabilistic modelling.

5. Can you give us a taster of how a statistical technique can be applied to ecosystem management challenges?

One taster is as follows. You record through time the actions of rhino poachers, rhino preserve landowners, national police forces, and a nongovernmental agency focused on rhino conservation. Some of the actions you record are rhino-is-poached, poacher-is-shot, NGO-donates-money-for-more-rangers, police-double-antipoaching-patrols. You also observe rhino abundance on a few plots of ground at several different time points. You have a rhino population dynamics model and dynamic decision-making models for the poachers, police, and NGO. Use the actions data to statistically fit the parameters of this agent-based meta-model formed by the interactions of these group decision-making models and the rhino population model. Use the robust, frequentist estimation method of minimum simulated Hellinger distance. Make sure you have a pretty fast computer.
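To make the estimation idea concrete, here is a minimal sketch in Python (not the book's R code, and not Dr. Haas's actual procedure). It assumes a deliberately tiny stand-in for the agent-based simulator: monthly counts of a single recorded action follow a Poisson distribution with one unknown rate. All function names (`simulate_counts`, `hellinger`, `fit_rate`) and the grid search are illustrative assumptions. The point is only the shape of the method: simulate from the model at a candidate parameter value, compare the simulated and observed distributions with the Hellinger distance, and keep the candidate with the smallest distance.

```python
import math
import random
from collections import Counter


def poisson_draw(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(rate, n, rng):
    """Toy stand-in simulator (assumption): n monthly event counts."""
    return [poisson_draw(rate, rng) for _ in range(n)]


def empirical_pmf(sample):
    """Relative frequency of each distinct value in the sample."""
    total = len(sample)
    return {value: freq / total for value, freq in Counter(sample).items()}


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (dicts)."""
    support = set(p) | set(q)
    squared = sum(
        (math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2
        for k in support
    )
    return math.sqrt(squared / 2.0)


def fit_rate(observed, candidates, n_sim=20000, seed=1):
    """Grid search for the rate whose simulated output is closest to the data."""
    target = empirical_pmf(observed)
    best_rate, best_dist = None, float("inf")
    for rate in candidates:
        # Re-seeding gives every candidate the same random number stream.
        rng = random.Random(seed)
        simulated = empirical_pmf(simulate_counts(rate, n_sim, rng))
        dist = hellinger(target, simulated)
        if dist < best_dist:
            best_rate, best_dist = rate, dist
    return best_rate, best_dist


if __name__ == "__main__":
    data_rng = random.Random(42)
    observed = simulate_counts(3.0, 500, data_rng)  # synthetic "field data"
    grid = [0.5 * k for k in range(1, 13)]          # candidate rates 0.5 to 6.0
    rate_hat, dist = fit_rate(observed, grid)
    print(f"estimated rate = {rate_hat}, Hellinger distance = {dist:.3f}")
```

A real political-ecological simulator would have many parameters and a far more expensive simulation step, so the grid search here would give way to a numerical optimizer and, as the interview suggests, a fast computer.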

6. You have also written another book for Wiley, Improving Natural Resource Management: Ecological and Political Methods. What is it about the area of natural resources and ecosystems that fascinates you?

In my first book, I formally and completely lay out my approach to first building a simulator of a political-ecological system (and then fitting it to data), followed by the computation of the Most Practical Ecosystem Management Plan (MPEMP) from this fitted simulator.

With tenure in the United States, it’s nearly impossible to fire an academic! So up to 1995, I was under tremendous pressure to produce a lot of papers and I managed to make the cut. I then asked myself what the most important problem facing the planet is. Biodiversity loss cannot be undone. If you lose a species, apart from speculative literature, there is very little hope of bringing that species back. Due to this, I think biodiversity trumps global warming, and you are starting to see this in both the literature and the media. I already had some experience in helping people manage ecosystems with acid rain deposition and helping to manage land in the West, so I thought about starting a long-term project where we develop social and ecological models of at-risk ecosystems. I started by constructing models and statistical methods to estimate them as a research agenda, which culminated in my first book. I had reviewed a book for Wiley that referenced some of my articles and, as I wrote up the review, the form included a question asking the reviewer if they would be interested in writing a book. I eventually wrote a proposal.

7. What will be your next book-length undertaking?

I have had the very good fortune to have published two books with Wiley over the past three years. But now I need to produce some peer-reviewed articles.

After I do that, I would like to update my first book with decision-making models that detail more examples of finding sustainable ecosystem management plans, and examples of using large cluster computers to perform minimum simulated Hellinger distance statistical estimation.

8. With an educational background in engineering from California State University and statistics from Colorado State University, what was it that brought you to recognise statistics as a discipline in the first place?

Shortly after I earned my bachelor’s degree in Mechanical Engineering, I became interested in mathematical sociology. I enrolled in the Department of Sociology at the University of California – Santa Barbara and supported myself with a part-time engineering job because, intellectually, I had decided that I wasn’t satisfied with being an engineer. There were few openings in sociology, so I researched which field was closest to mathematical sociology. I found statistics and thought a doctorate in statistics would enable me to find an academic job so that I could continue my research in group decision-making processes.

Six different graduate programs later, during which I increased my mathematical abilities, I was accepted at Colorado State University, where I wrote a dissertation on a probabilistic model of President Kennedy’s ExComm advisory committee that he convened during the 1962 Cuban Missile Crisis. Professor Martha Cottam from the University of Denver helped me, as she was an expert on how politicians reach decisions about international-level issues. I then hit upon my approach to modelling a person or group reaching a politically oriented decision.

Whilst still at Colorado, I was working as a research assistant in the Natural Resource Ecology Laboratory and hence worked with many ecologists. The fields of decision-making and ecosystems then came together in my head. The real problem with managing ecosystems is that the managers involved have politically motivated goals and this needs to be understood better. I don’t consider myself a statistician; I consider myself more a probabilistic modeller who uses statistics to validate the models that I develop.

9. You are currently Associate Professor of Business Statistics at the University of Wisconsin-Milwaukee. Please could you tell us about teaching business statistics, and the fun and the challenges you get from teaching?

I set a high standard in my courses. Some of my students meet this challenge. I enjoy integrating both real-world examples and computing in my courses whether they be at the undergraduate, masters in business administration, or doctoral level.

With Big Data, the models being produced to handle data such as credit card transactions have really complicated algorithms, which sit on top of older algorithms. We have a really hard time teaching young people who have spotty backgrounds and who turn solemn when you tell them about the standard that is expected.

I do find though that there are about 10% of students who really appreciate the fact that I tell them how things really are in statistics and push them. This small group tends to send me very kind emails afterwards saying that I was the most important professor during their student years.

10. What do you think the most important recent developments in the field have been? What do you think will be the most exciting and productive areas of teaching business statistics during the next few years?

The most important development in my opinion is the increasing availability of massively parallel computing platforms. Statistical analysis that supports management decision-making needs to take full advantage of fast computation. Students need to acquire skills in translating mathematical algorithms into computer programs, such as what one does when translating a new algorithm into R code.

Computations are changing the probabilistic models that people are using to address a problem from an older, more classical family of models to more agent-based or semi-parametric models, which are computationally very expensive to fit. So, computation is changing not only the activity of statistical estimation but also the models that people choose to fit in the first place.

Therefore, the advent of much faster computing power is, to me, a sea change. A lot of people in statistics who are now in their 50s or 60s were trained before computers were fast, so they are not changing their curricula quickly enough to meet modern capabilities. It is better than it was ten years ago, but we still do not use faster computers to their full potential.

11. Are there people or events that have been influential in your career?

One of my doctoral committee co-advisors, Professor Martha Cottam, then at the University of Denver, Department of International Relations (1988), helped me form my approach to the probabilistic modelling of group decision-making.

I am afraid I cannot remember his name, but the teacher of a Russian history class I took as an undergraduate started me realising that I did not want to be an engineer.