  • Author: Carlos Alberto Gómez Grajales
  • Date: 10 Aug 2015
A colleague recently shared with me an opinion piece published in a national newspaper here in Mexico. The article was entitled “If our surveyors were pilots, we would all be dead” [1]. What a way to end my coffee break.

To be honest, complaining about the results of opinion polls and surveys is nothing new, nor is it exclusive to Mexico. This episode immediately brought to mind the backlash against the surveys and forecasts produced during the U.K.'s General Election, just about two months ago. Take a look at the #pollgate hashtag on Twitter if you don't know what I'm talking about. Public opinion surveys seem to have fallen into public disgrace all around the globe, owing to their inaccuracy and to what some describe as flawed methods. But I'm not here to defend polling. Many colleagues are working on it. I'm here to defend statistics.

What many journalists and commentators overlook is that these polls rely on methodological procedures shared with the surveys that underpin many other disciplines. In fact, many decisions that affect the course of a country are based on survey estimates. The most common economic indicators, those that set the course for the commander in chief, actually come from surveys not very different from the ones the media has been complaining about.

Economics is the study of the consumption, production and distribution of goods and services, the scientific study of wealth: how it is created and distributed. It is particularly interesting for governments, since economic policy is often a major consideration for voters and therefore a concern for presidents and prime ministers alike. The economic strategy that a government follows is closely dictated by the current situation in the country. But how can it know exactly what is going on at a macroeconomic level? Many macroeconomic indicators are periodically estimated with surveys. We will talk about two of the most prominent ones: inflation and unemployment.

The inflation rate is one of the most interesting and sophisticated uses of statistical analysis in economics nowadays. It is a measure of the average change over time in the prices of consumer items, and it is important because it tracks how much the currency can buy [2]. Controlling inflation is one of the major roles of central banks, so it is crucial to obtain accurate and reliable estimates. To that end, each country has developed its own inflation index, with some minor regional differences. Two well-documented ones are the Consumer Price Index (CPI), estimated by the Bureau of Labor Statistics in the United States, and the Harmonised Index of Consumer Prices (HICP), calculated by Eurosystem. Both involve a remarkable combination of survey design, sampling and statistical inference. The Bureau of Labor Statistics sums it up this way:

“The CPI is a complex measure that combines economic theory with sampling and other statistical techniques and uses data from several surveys to produce a timely and precise measure of average price change for the consumption sector of the American economy” [2].

The American CPI uses multi-stage sampling, a design common to many other surveys. The procedure first selects clusters of geographic areas, known as primary sampling units (PSUs), stratified by population size, by whether they form a metropolitan area and by census region [2]. Within each PSU, a sample of outlets and items is then drawn, and their prices enter the calculation. The HICP calculation involves a similar form of sampling, including regional stratification [3].
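
The two stages described above can be sketched in a few lines of Python. This is only an illustration: the sampling frame, the stratum names and the function `two_stage_sample` are hypothetical, and units are drawn with equal probability for simplicity, which is not necessarily how either agency selects them.

```python
import random

def two_stage_sample(frame, psus_per_stratum, outlets_per_psu, seed=0):
    """Draw PSUs within each stratum, then outlets within each chosen PSU.

    frame maps stratum -> {psu_name: [outlet, ...]} (a made-up structure).
    Returns {psu_name: [sampled outlets]}.
    """
    rng = random.Random(seed)
    sample = {}
    for stratum, psus in frame.items():
        # Stage 1: select PSUs inside this stratum (equal probability, an assumption).
        chosen = rng.sample(sorted(psus), min(psus_per_stratum, len(psus)))
        for psu in chosen:
            outlets = psus[psu]
            # Stage 2: select outlets inside each selected PSU.
            sample[psu] = rng.sample(outlets, min(outlets_per_psu, len(outlets)))
    return sample

# Hypothetical frame: 3 strata, 4 PSUs per stratum, 10 outlets per PSU.
frame = {
    stratum: {
        f"{stratum}-PSU{p}": [f"{stratum}-PSU{p}-outlet{o}" for o in range(10)]
        for p in range(4)
    }
    for stratum in ("large_metro", "small_metro", "non_metro")
}

sample = two_stage_sample(frame, psus_per_stratum=2, outlets_per_psu=3)
for psu, outlets in sample.items():
    print(psu, outlets)
```

Stratifying first guarantees that every type of region appears in the sample, while sampling in stages keeps the number of places that price collectors must visit manageable.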

After selecting the regions and outlets, the agency must choose which prices to measure. Not all items are priced monthly; typically only those with more volatile prices are. The choice of the items, products and services included in the index is not trivial, and the same agency often calculates several indices, each covering a different set of items. In Mexico, alongside the INPC (Índice Nacional de Precios al Consumidor), the official statistics agency calculates a medium-term trend index called “Inflación Subyacente” (core, or underlying, inflation) [4]. It uses a similar calculation but excludes the most volatile prices in the economy, usually crops and food-related items. This number gives a better overview of the price changes caused by macroeconomic variables.

Each agency has its own set of items, usually in the tens of thousands, which are commonly measured. The items vary due to cultural differences and purchasing behaviour. For instance, the Harmonised Index of Consumer Prices (HICP), calculated by Eurosystem, includes items such as food, newspapers, petrol, hairdressing and insurance. The prices of around 700 products are collected every month, in different outlets and in approximately 1,600 towns across the euro area. That adds up to around 1.8 million price observations every month [5]. These prices include both taxes and any price reductions or sales. The American CPI includes food and beverages (both at home and away from home), housing, medical care, recreation, education and communication, amongst many other expenses [2].

As the BLS described, measuring inflation involves not only survey analysis but also more technical statistical input, which is worth talking about. Each of the items whose price is measured has to be weighted for the calculation, because some products matter more to the average family than others. If somewhere, somehow, the price of a Ferrari suddenly doubles, most of the population won't even notice, even though the price increase was steep. On the other hand, if the price of eggs goes up, most of the country will complain. This is true everywhere in the world. For this reason, each item included in an inflation rate is given a weight, usually related to the item's share of total expenditure across all households [3]. That means that the products people spend more on, like beer, usually have a bigger weight in the index than some less popular products, like soy milk. For instance, the Mexican index includes two categories for tortillas, one of which carries one of the biggest weights in the index, higher than that of sodas, chicken and public transportation [6].
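
A small numeric sketch shows why expenditure-share weights matter. The spending figures and price changes below are invented for illustration, and the helper functions are hypothetical rather than any agency's actual procedure.

```python
def expenditure_weights(spending):
    """Turn total household spending per item into weights that sum to 1."""
    total = sum(spending.values())
    return {item: amount / total for item, amount in spending.items()}

def weighted_price_change(weights, price_changes):
    """Weighted average of per-item price changes (as fractions, 0.10 = 10%)."""
    return sum(weights[item] * price_changes[item] for item in weights)

# Made-up figures: spending per item across all households, and price changes.
spending = {"eggs": 300.0, "tortillas": 500.0, "sports car": 1.0}
price_changes = {"eggs": 0.10, "tortillas": 0.02, "sports car": 1.00}

weights = expenditure_weights(spending)
weighted = weighted_price_change(weights, price_changes)
unweighted = sum(price_changes.values()) / len(price_changes)

print(f"weighted change:   {weighted:.1%}")
print(f"unweighted change: {unweighted:.1%}")
```

With these numbers the sports car doubles in price, yet the weighted change is only about 5%, because almost nobody spends money on it. The unweighted average, at about 37%, would badly overstate what households actually experience.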

The use of weights is a common practice in survey analysis. Economic, political and social studies commonly use weights to balance the importance of the observations, either by demographic profile or by some other measure of importance. An inflation index is an estimate of a weighted average of price changes, which means that the estimate adjusts the importance of each of the items and products included in the calculation. It is worth noting that most inflation rates aren't estimated via arithmetic means: most of them use a geometric mean instead. The reason for the choice lies in economic theory: the geometric mean is thought to capture the behaviour of consumers who buy less of a product when it gets dearer, whereas the common arithmetic mean fails to take this into account [7].
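
The difference between the two averages is easy to see with a few price relatives (new price divided by old price). The figures below are invented, and equal weights are assumed to keep the sketch short; a real index would combine this with the expenditure weights discussed earlier.

```python
import math

def arithmetic_mean(relatives):
    """Plain average of the price relatives."""
    return sum(relatives) / len(relatives)

def geometric_mean(relatives):
    """n-th root of the product of the price relatives, computed via logs."""
    return math.exp(sum(math.log(r) for r in relatives) / len(relatives))

# Made-up price relatives for three varieties of one product:
# one rose 20%, one was unchanged, one fell 5%.
relatives = [1.20, 1.00, 0.95]

print(f"arithmetic mean: {arithmetic_mean(relatives):.4f}")
print(f"geometric mean:  {geometric_mean(relatives):.4f}")
```

The geometric mean is never larger than the arithmetic mean, so it gives less influence to the items whose prices rose the most. That is consistent with the idea that shoppers move away from whatever has become dearer.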

Inflation isn't the only interesting number that economists estimate with surveys. The Current Population Survey (CPS) is a monthly survey of households conducted by the Bureau of Census for the Bureau of Labor Statistics. It provides a comprehensive body of data on the labor force: employment, unemployment, persons not in the labor force, hours of work, earnings, etc. [8].

Unemployment is very interesting in itself. The American BLS defines the unemployed as persons who do not have a job, have actively looked for work in the prior 4 weeks, and are currently available for work [9]. Each country has its own definition, but they are broadly similar. Because of the impact that unemployment can have on a country (families lose wages, the country loses production and the workers' purchasing power disappears), this is another indicator that is estimated periodically with surveys. The same survey can also help identify the demographic groups most affected and even the regions of the country where the impact is greatest.

In the Current Population Survey, there are about 60,000 eligible households in the sample, which means roughly 110,000 individuals are interviewed each month [9]. The Labour Force Survey, its European equivalent, is the largest European household sample survey, with around 1.8 million interviews in the 33 countries it covers [10]. The interviews are usually done face to face or by phone. Survey respondents are never asked specifically whether they are unemployed, nor are they given an opportunity to decide their own labour force status. Their status is determined from how they respond to a specific set of questions about their recent activities [9].
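
The idea of deriving labour force status from answers, rather than asking for it, can be sketched as follows. The three yes/no fields are a drastic, hypothetical simplification of the real questionnaire, and the rate is computed without the survey weights an agency would apply.

```python
from dataclasses import dataclass

@dataclass
class Response:
    # Hypothetical fields standing in for a longer set of questions.
    has_job: bool
    looked_for_work_last_4_weeks: bool
    available_for_work: bool

def labour_force_status(r):
    """Classify a respondent using the BLS-style definition quoted above."""
    if r.has_job:
        return "employed"
    if r.looked_for_work_last_4_weeks and r.available_for_work:
        return "unemployed"
    return "not in labour force"

def unemployment_rate(responses):
    """Unemployed as a share of the labour force (employed plus unemployed)."""
    statuses = [labour_force_status(r) for r in responses]
    unemployed = statuses.count("unemployed")
    labour_force = unemployed + statuses.count("employed")
    return unemployed / labour_force if labour_force else float("nan")

# Made-up sample: 8 with jobs, 1 searching and available,
# 1 searching but not available, 2 not searching at all.
responses = (
    [Response(True, False, True)] * 8
    + [Response(False, True, True)]
    + [Response(False, True, False)]
    + [Response(False, False, True)] * 2
)

print(f"unemployment rate: {unemployment_rate(responses):.1%}")
```

Note that the three jobless respondents who fail either the search test or the availability test are left out of the labour force altogether, so they do not count towards the rate.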

As you can see, survey methodology is commonly used in economics to draw a reliable picture of the current economic situation of a region. It is important to note that the methodology commonly used by official agencies is not dissimilar to the one used by polling agencies (and in my experience, some pollsters even do better). The main difference is sample size, which is directly related to cost. Most pollsters work with samples of around 2,000 interviews, while official agencies have the budget and resources to reach hundreds of thousands. It is not that polling companies cannot interview more people - it simply means that those buying opinion surveys do not have the budget to do so. As a result, margins of error in opinion polls tend to be much higher than those found in most official surveys. That is also why pollsters and analysts add statistical models on top of the raw data to make more accurate estimates, trying to get the most from their limited budgets [11]. So do not immediately dismiss surveys because of some occasional missteps: an economic survey is a powerful scientific tool that allows us to understand our world and also forms the basis of some of the most important economic decisions a government makes.
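
For readers who want to see the effect of sample size in numbers, here is a rough sketch using the textbook margin of error for a proportion. It assumes simple random sampling, a 95% confidence level (z = 1.96) and the worst case p = 0.5; real multi-stage surveys have design effects that this formula ignores, so treat the results as indicative only.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

# 2,000 is the typical poll size mentioned above; 60,000 is used here
# only as an illustrative official-survey sample size.
for n in (2_000, 60_000):
    print(f"n = {n:>6}: +/- {margin_of_error(n):.2%}")
```

Under these assumptions a poll of 2,000 interviews carries a margin of roughly 2.2 percentage points, while a sample of 60,000 brings it down to about 0.4. Because the margin shrinks with the square root of the sample size, halving it requires four times as many interviews.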

