Can we quantify the level of over-confidence in opinion polls?

Author: Dr John Fry

The story of scientific opinion polling is a litany of failed forecasts. From a United Kingdom perspective, recent episodes include the 2017 general election and, perhaps most significantly, the 2016 referendum on the United Kingdom’s continued membership of the European Union, the so-called Brexit referendum.

In a referendum held on June 23rd 2016, the United Kingdom voted in favour of leaving the European Union by a margin of 51.9% versus 48.1% of the final vote. Though there had been indications of recent increases in Euroscepticism, and many opinion polls did suggest a close-fought vote, in many ways a vote for Brexit was a shock. Opinion polls and bookmakers’ odds both pointed to a win for Bremain. Results from a recently published paper suggest that, collectively, bookmakers’ odds did not follow polling data sufficiently closely [1]. This stands in marked contrast with theoretical models of political prediction markets in [2] and suggests that this is fundamentally an empirical issue for bookmakers. In principle, bookmakers’ odds should give a fuller picture provided they adequately take into account all the available data.


Since the start of scientific polling, the accuracy of opinion polls has been questioned [3]. Social pressures mean survey respondents often over-report their likelihood of voting. Relatedly, survey respondents may not admit their true voting intentions if they think that their choice is not very popular. Uncomfortably, in the United States, this has led to the so-called Bradley effect, with polls typically over-estimating the level of support for non-white candidates. In the United Kingdom, this has led to the phenomenon of the “shy Tory”, with polls often under-estimating the true level of support for the Conservative Party. More recently, it is possible that similar Bradley effects applied to Brexit (if people were embarrassed about admitting to voting for Brexit) or even to the 2017 General Election campaign (if the opinion polls in this case systematically over-estimated the true level of support for the incumbent Prime Minister Theresa May relative to the oft-criticised leader of the opposition Jeremy Corbyn).

Here, we analyse the Brexit referendum. Statistically, the raw forecasting challenge is intriguing. Referenda are relatively simple compared to more complicated first-past-the-post elections. Referenda lack some of the tribalism of party politics and individual constituency-level effects associated with general elections in the United Kingdom.

In view of the above, why is it that opinion polls get things wrong? Do social pressures lead to polls over-pricing the probability of a vote in favour of Bremain? Can we therefore draw an analogy between speculative bubbles in financial markets and related effects in political systems?

Beyond a purely statistical perspective, the interplay between financial and political modelling is interesting, with financial modelling seemingly better established and relatively more advanced. Finance and politics both constitute classic examples of complex social systems so it is natural to ask if both areas can be well described by similar techniques [4-5].

Finance and politics are both incredibly complex areas cloaked in a certain degree of mythology and mysticism. A powerful social element underpins both settings. This includes psychological factors (e.g. over-confidence, the bias blind spot) and herding [1, 5]. Economic factors often assume central importance in political debates and the political science literature includes a range of sophisticated economic and statistical models. This includes works examining the relationship between political partisanship and economic conditions, political business cycle theories (whereby incumbent politicians try to manipulate the economy to maximise their chances of re-election) and studies of the relationship (in both directions) between stock prices and national elections. Combinations of opinion polls and economic data have also been used in previous academic work on political forecasting [6]. There is even a suggestion that the stock market has a greater impact upon US presidential elections than other economic variables [7].

Though politics and finance have much in common, a precise thread between them can be found via the notion of subjective probability [8]. In particular, data from opinion polls can be interpreted as representing the probability that a randomly chosen constituent will vote a certain way. If P_t denotes the proportion who vote for Bremain, this can be interpreted as the price at time t of a bet that pays £1 if a randomly chosen constituent votes for Bremain and 0 otherwise. The quantity 1 - P_t has a similar interpretation for Brexit. The ratio of the probabilities, P_t/(1 - P_t), can then be interpreted as the price of a bet that a randomly chosen constituent votes for Bremain in units of the price of a bet that a randomly chosen constituent votes for Brexit. We can thus test for a systematic over-pricing by looking for evidence of speculative bubbles in the odds ratio defined above [1].
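
As a minimal illustration of the odds ratio described above, the following Python sketch converts a poll share into the corresponding odds and back again. The function names are hypothetical and chosen only for this example; the sketch shows the arithmetic of the ratio and is not the bubble test of [1].

```python
def odds_ratio(p_remain: float) -> float:
    """Price of a Bremain bet in units of a Brexit bet: P / (1 - P)."""
    if not 0.0 < p_remain < 1.0:
        raise ValueError("p_remain must lie strictly between 0 and 1")
    return p_remain / (1.0 - p_remain)


def share_from_odds(odds: float) -> float:
    """Invert the odds ratio to recover the vote share P = odds / (1 + odds)."""
    if odds <= 0.0:
        raise ValueError("odds must be positive")
    return odds / (1.0 + odds)


# An evenly split electorate has odds of exactly 1.
even_odds = odds_ratio(0.5)

# Odds implied by the final Bremain share of 48.1% reported in the text.
final_odds = odds_ratio(0.481)
```

An odds ratio above 1 corresponds to polls favouring Bremain, and a ratio below 1 to polls favouring Brexit.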

In a recently published paper [1], we found evidence of speculative bubbles in opinion polls in the lead-up to the Brexit referendum, and this evidence is robust to the choice of estimation window. The overall estimate was that the odds ratios are 10% over-valued. It is estimated that the proportion who will vote for Bremain is 49.4%, with an estimated probability of a vote for Brexit of 0.787. Statistically speaking, this means that the chances of a vote for Brexit are higher than the raw numbers would suggest.
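
To give a feel for what a 10% over-valuation of the odds ratio means in terms of vote share, the sketch below deflates the odds by a constant factor and converts the result back to a share. This is an illustrative assumption only: the model in [1] is estimated from the polling time series, whereas here the over-valuation is simply read as a multiplicative factor of 1.10 on the odds. The function name and the 52% input are hypothetical.

```python
def deflate_share(p_remain: float, overvaluation: float = 0.10) -> float:
    """Divide the odds P / (1 - P) by (1 + overvaluation) and return the implied share.

    Assumption: "x% over-valued" means observed odds = (1 + x) * fair odds.
    """
    if not 0.0 < p_remain < 1.0:
        raise ValueError("p_remain must lie strictly between 0 and 1")
    if overvaluation <= -1.0:
        raise ValueError("overvaluation must be greater than -1")
    observed_odds = p_remain / (1.0 - p_remain)
    fair_odds = observed_odds / (1.0 + overvaluation)
    return fair_odds / (1.0 + fair_odds)


# Hypothetical raw poll share of 52% for Bremain (not a figure from the text).
corrected = deflate_share(0.52)
```

Under this reading, a hypothetical raw poll lead of 52% to 48% for Bremain corresponds to a corrected share just below 50%, which illustrates how a modest over-pricing of the odds can reverse the apparent outcome of a close vote.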

These results also compare favourably with prediction intervals based upon a simple linear regression model [1]. This again suggests that, without correcting for the Bradley effect, a naïve interpretation of raw polling figures is not very informative, beyond indicating that the final vote is likely to be very close.

Finally, we note that the model in [1] also gives reasonable results when applied to the 2014 Scottish Independence Referendum. In particular, it appears that a vote in favour of Scottish independence is over-priced by between 6% and 20%. The final estimate is that 43.5% will vote for independence, much closer to the final figure of 44.7% than many pundits predicted. The clear indication is that such a model to correct polling data for the Bradley effect may have a broad range of possible practical applications.

In conclusion, the close analogies between financial and political systems can be elucidated by statistical modelling. Further, recent elections suggest that statistical models may have an increasingly important role to play in navigating volatile political landscapes on both sides of the Atlantic.

[1] Fry, J. and Brint, A. (2017) Bubbles, blind-spots and Brexit. Risks 37.
[2] Kou, S. G. and Sobel, M. E. (2004) Forecasting the vote: A theoretical comparison of election markets and public opinion polls. Political Analysis 12 277-95.
[3] Hyman, H. (1944) Do they tell the truth? Public Opinion Quarterly 557-59.
[4] Sornette, D. (2003) Why stock markets crash: Critical events in complex financial systems. Princeton University Press, Princeton.
[5] Prechter, R. R. jr (1999) The wave principle of human behaviour and the new science of socionomics. New classics library, Gainesville.
[6] Lewis-Beck, M. S., Nadeau, R. and Bélanger, E. (2016) The British general election: synthetic forecasts. Electoral Studies 41 264-68.
[7] Prechter, R. R. jr, Goel, D., Parker, W. D. and Lampert, M. (2012) Social mood, stock market performance and US presidential elections: a socionomic perspective on voting results. SAGE Open October-December 1-13.
[8] Lad, F. (1996) Operational subjective statistical methods: a mathematical, philosophical and historical introduction. Wiley, New York.