Place your bets but trust your stats

Author: Carlos Alberto Gómez Grajales

Some months ago, in January 2016, a world record was broken in the United States, and a rather profitable one at that. With a jackpot of over 1.4 billion dollars, Powerball, an American lottery game offered by 44 states, became the world’s biggest jackpot ever [1]. The prize was split among three lucky winners, who were fortunate enough to overcome the elusive 1 in 292.2 million chance of holding the winning ticket. Considering each of these winners received about 500 million dollars after taxes, Powerball seems like a worthy bet, doesn’t it? Being such a lazy person, I haven’t worked out the numbers for Powerball, which should include not only the chances of winning the prize and the corresponding cash awards, but also the total number of tickets in play (as more tickets mean a larger chance of splitting the pool).

Luckily for us, some people have done it, concluding that, in the case of a huge jackpot of over a billion dollars, for every $2 invested in Powerball (the price of the cheapest ticket), you get $1.1355 in return, an average loss of about 86 cents for each ticket bought [2]. Thanks to mathematical analysis, we can regard the lottery as a bad gamble, or at least not a very profitable one. Believe it or not, this exact same question has been raised hundreds of times since the dawn of mankind or, more exactly, since the dawn of casinos and gambling. With the promise of riches and well-being, much effort has been devoted to studying gambling from a scientific point of view, sometimes reaching the boundaries of science and redefining mathematics itself. We statisticians have to be very grateful for the selfishness of those people who pursued their gambling interests.

You may argue that probability, one of the core concepts behind statistical theory, owes its existence to gambling alone. Liber de ludo aleae, the first mathematical treatise on probability, was the result of an Italian’s interest in gambling. Gerolamo Cardano, an Italian physician with an avid interest in pursuing easy money, devised the first mathematical approaches to games of chance [3]. He analysed the total possible outcomes of dice rolls, resulting in what some view as the first ever approach to probability theory. Common ideas in the study of chance, like the “sample space”, trace back to Cardano’s studies. But betting wasn’t solely Cardano’s concern: many other famous mathematicians throughout history helped lay the foundations of statistics thanks to not-so-noble interests like finding optimum wagers and payment systems. Pierre de Fermat, Blaise Pascal and even Johannes Kepler and Galileo dedicated some time to understanding how the gods of luck could be manipulated [4]. Richard Epstein, the renowned mathematician, once declared: “Gamblers can rightly claim to be the godfathers of probability theory” [4].

One of the key concepts in both statistics and gambling is the expected value. The idea that would eventually become one of the basics of Galton’s regression model was born out of gambling necessity: how should the money be divided if one of the gamblers decides to bail out when he’s approaching a loss? This problem, which dates back to medieval times and is usually known as the “problem of points”, became the cornerstone of probability theory. In the seventeenth century, Antoine Gombaud, a French author and amateur mathematician, invited a group of bright minds to work on the unsolved puzzle, among them two notable mathematicians: Pierre de Fermat and Blaise Pascal [3] [4]. By working on the solution to the problem, Pascal unintentionally became the first to define what would later be known as the “expected value”, a pillar term in both statistics and gambling.

In gambling, the expected value can be regarded as an average: it is the mean of the profits and losses you would receive after making the exact same wager an infinite number of times. If you encounter a wager that wins you $100 with a 20% chance and loses you $40 otherwise, you can expect to win $100 one time out of five, while losing $40 the other four. That means that, over five typical games, your results would be $100, -$40, -$40, -$40, -$40, a total of -$60. On average, you’d be losing $12 per game. As you can see, this game is a lousy bet, taking $12 away from you each time you play.
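For the sceptics, the arithmetic above can be checked in a few lines of Python. This is only a minimal sketch: the function name `expected_value` and the way the wager is encoded as (probability, payoff) pairs are my own choices for illustration.

```python
def expected_value(outcomes):
    """Probability-weighted average of a list of (probability, payoff) pairs."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must add up to 1")
    return sum(p * payoff for p, payoff in outcomes)


# The wager from the text: win $100 with a 20% chance, lose $40 otherwise.
wager = [(0.20, 100.0), (0.80, -40.0)]
ev = expected_value(wager)
print(f"Expected value per game: {ev:.2f} dollars")
```

A negative result means that, in the long run, the game costs you money; a fair game would come out at exactly zero.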

Considering that gamblers brought to life the ideas behind probability theory, it is hardly shocking to learn that betting companies today still rely on many probability concepts, some of them ancient. One of these is the concept of “odds”. Let’s suppose the odds of anyone liking this article are 1 to 10 (I’m aware I’m not a very good writer). These odds are usually written as 1:10, which reads “one to ten”. What this means in plain English is that for every 11 people who commit to reading this article, 1 of them will like it and 10 of them will close their browsers in anger.

Odds are the most common expression of the likelihood of an event in betting, and everyone familiar with gambling should be fairly accustomed to them. Consider a major sports event, like Euro 2016. Germany started with odds of 2 to 9 (2:9) of winning the tournament, which meant that, at least for the people who calculated them (more on that later), if Euro 2016 were to be played 11 times, the German team would win 2 of these cups outright [6]. Compare that to the odds Portugal had, which were 1 to 20, meaning that Cristiano Ronaldo’s team was expected to win one out of every 21 tournaments played. Notice how easy it is to convert odds to probabilities, something we have just been doing. With odds of 1 to 20, the probability of Portugal winning Euro 2016 was estimated to be 1/21 = 0.0476. Germany’s probability of lifting the cup was estimated at 2/11 = 0.1818. The opposite is also true: once you have estimated the probability of an event, it is fairly simple to work out the odds, which are defined as “probability of the event happening / probability of the event NOT happening”.
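Both conversions fit in a couple of short functions. The sketch below is only illustrative: the function names are hypothetical, and it follows this article’s convention of writing odds as “in favour to against”.

```python
from fractions import Fraction


def odds_to_probability(in_favour, against):
    """Odds of 'in_favour to against' become in_favour / (in_favour + against)."""
    return Fraction(in_favour, in_favour + against)


def probability_to_odds(probability):
    """Odds are P(event happening) / P(event NOT happening)."""
    probability = Fraction(probability)
    return probability / (1 - probability)


portugal = odds_to_probability(1, 20)
germany = odds_to_probability(2, 9)

print(f"Portugal: {float(portugal):.4f}")  # 0.0476
print(f"Germany:  {float(germany):.4f}")   # 0.1818
print(probability_to_odds(portugal))       # 1/20, back where we started
```

Using exact fractions rather than decimals keeps the round trip from odds to probability and back free of rounding noise.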

Casinos and betting sites don’t usually present you with the odds of an event happening anymore. Nowadays, they show you the “betting line”. The betting line is another way of expressing the odds, though in the universal and easier to understand language of money. According to some sports sites, the betting lines for the Euro 2016 final were France: +105, Portugal: +350 [7]. Betting lines are usually expressed using a baseline of $100. What these numbers mean is that by betting $100 on France you could have won $105, and by betting $100 on Portugal you could have won $350. These numbers can also be easily translated to odds: Portugal had odds of winning of 100 to 350 and France had odds of winning of 100 to 105.
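The same translation can be sketched in Python. The function name `line_to_probability` is made up for this example, and it only covers positive lines like the two quoted here, read the way this article reads them (a line of +350 as odds of 100 to 350).

```python
def line_to_probability(line, baseline=100):
    """Implied probability of a positive betting line.

    A line of +350 is read as odds of 100 to 350, i.e. 100 / (100 + 350).
    """
    if line <= 0:
        raise ValueError("this sketch only covers positive betting lines")
    return baseline / (baseline + line)


final_lines = {"France": 105, "Portugal": 350}
for team, line in final_lines.items():
    probability = line_to_probability(line)
    print(f"{team}: +{line} -> odds of 100 to {line}, probability {probability:.3f}")
```

Taken at face value, then, these lines put France’s chances at about 49% and Portugal’s at about 22%.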

Well, that’s not exactly true. There are a couple of tricks that betting companies use to make sure they win no matter what. One of them is to charge a service fee or commission on the winnings. This commission varies, yet it is usually in the region of 20% of your winnings [8]. What’s worse, this commission is usually hidden within the betting lines. Portugal doesn’t have odds of 100 to 350; they probably have odds closer to 100 to 370, but the $20 missing from the betting line has been deducted as the “commission fee”. Without that fee, casinos and sports sites would be in trouble, as that amount is what allows them to make a profit.

Contrary to what you have probably imagined, betting lines and bookmaker odds are in no way prediction mechanisms. The reason is that, unlike gamblers, casinos and bookmakers have no interest in predicting who will win a match in any particular sport; all they care about is making money [4]. Let me briefly explain how the whole betting scheme works. Suppose a soccer match is perfectly even, with each of the two teams having a 50% chance of winning. In this scenario, a fair bet of $10 would give a return of exactly $10. There are obviously no fair bets in Las Vegas or anywhere else, so a $10 bet would more likely give a return of about $9.50, due to the bookmaker’s commission. Welcome to the real world: if you bet the same amount on both teams, you’d end up losing 50 cents (and this assumes a rather nice and cheap bookmaker commission). As you can see, when both teams attract a similar amount of bets, the house wins. But if a disproportionate amount of money is wagered on a single team, the house’s risk of losing increases.
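Seen from the bettor’s side, the numbers look like this. The snippet is a rough sketch using only the figures above ($10 stakes, a $9.50 return) and the simplifying assumption, also taken from the example, that one of the two teams always wins.

```python
STAKE = 10.00      # amount bet on a team
WINNINGS = 9.50    # what a winning $10 bet returns after the bookmaker's cut
P_WIN = 0.5        # the match is assumed to be perfectly even

# Backing both teams: exactly one bet wins and the other one loses.
both_teams = WINNINGS - STAKE

# Backing a single team: the long-run average result of that one bet.
single_team = P_WIN * WINNINGS + (1 - P_WIN) * (-STAKE)

print(f"Bet on both teams: {both_teams:+.2f} dollars, guaranteed")
print(f"Bet on one team:   {single_team:+.2f} dollars on average")
```

Either way the sign is negative: the commission turns a fair coin toss into a losing proposition for the bettor.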

Let’s say that, for some reason (perhaps the goalkeeper appeared wearing his lucky colour), bettors decide to pile their money on a single team, so every single bet is placed on this team and not a single dime is wagered on the other (I mean, the goalkeeper wore black, who could bet on that?). Remember, the teams are perfectly equal, so there’s a good 50% chance either team wins, no matter what gamblers believe. In this case, there’s a 50% chance the casino has to pay out to every gambler who placed a bet and lose a considerable amount of money. Since casinos aren’t really fond of losing, the betting lines, i.e. the odds for the match, are adjusted according to how people are betting. Gambling companies regularly update their betting lines according to where the money is going, so it becomes more expensive to keep piling bets on the same team and more and more rewarding to bet on the underdog. This will eventually balance the books, allowing the bookmakers to make healthy profits. Now, this doesn’t mean bookmakers are not relying on complex statistical models. I mean, duh, Big Data, people. Companies like Cantor Gaming, one of the most important bookmakers in the United States, regularly use huge datasets to understand which events during a game affect gamblers’ actions. Bookmakers don’t try to predict who will win; they try to predict whom gamblers will bet on. Their job is to balance the books to make a profit, tweaking the published odds to make sure that the house wins no matter what.
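To see why a lopsided book worries the house, here is a small sketch from the bookmaker’s point of view. The amounts wagered are invented for illustration, the function name is hypothetical, and the payout of 95 cents per dollar simply mirrors the $9.50 return on a $10 bet used earlier.

```python
def house_profit(bets_on_a, bets_on_b, payout_per_dollar=0.95):
    """Bookmaker's profit in each of the two possible outcomes.

    The house keeps the losing side's stakes and pays winnings
    (payout_per_dollar for each dollar staked) to the winning side.
    """
    if_a_wins = bets_on_b - payout_per_dollar * bets_on_a
    if_b_wins = bets_on_a - payout_per_dollar * bets_on_b
    return if_a_wins, if_b_wins


# Hypothetical books: $10,000 wagered in total in both cases.
balanced = house_profit(5_000, 5_000)
lopsided = house_profit(10_000, 0)

print("Balanced book: profit if A wins = {:.0f}, if B wins = {:.0f}".format(*balanced))
print("Lopsided book: profit if A wins = {:.0f}, if B wins = {:.0f}".format(*lopsided))
```

With a balanced book the house pockets the same amount whichever team wins; with everything piled on one team, its result depends entirely on the match, which is exactly the kind of gamble a bookmaker wants to avoid.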

But even if bookmakers aren’t working to predict sports and race outcomes, plenty of people are already out there throwing money at statistical models that forecast sports results. I have already discussed how people have used statistics to earn ridiculous amounts of money, either by counting cards or by beating Wall Street [9]. So you can imagine that there are millions invested in using the most sophisticated statistical methods to predict sports outcomes.

Mark J. Dixon and Stuart Coles have a rather interesting story to tell: few people can claim to have created a whole industry with a single paper. “Modelling Association Football Scores and Inefficiencies in the Football Betting Market”, published in 1997, would become the first highly publicized effort to scientifically predict football results [10]. The researchers used maximum likelihood estimation to fit a Poisson regression model that forecast the number of goals scored by each team. Though the model was rather simple, the study reported positive returns when its predictions were pitted against the bookmakers’ odds: it was good for betting. For rather obvious reasons, just after they published the paper both researchers left their academic careers in pursuit of more profitable endeavours. Dixon set up Atass Sports, a British consulting firm specialized in using statistical analysis to predict sports outcomes [11]. Coles, on the other hand, joined Smartodds, another consulting firm that sells gambling tips to professionals. They now base their predictions on sophisticated statistical models that grew out of their original work [4]. These two companies are just some of the many major ventures that have sprung up in Europe, Asia and America, all of them dedicated to the task of giving an edge to gamblers who wish to outsmart the house. And football wasn’t even among the first sports to be outsmarted by statistics.
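To give a flavour of how a Poisson model turns goals into betting probabilities, here is a heavily simplified sketch. It is not the model from the paper: it assumes the two teams score independently, and the scoring rates (1.6 and 1.1 goals per match) are numbers I made up, whereas a real model would estimate them from past results. The function names are hypothetical too.

```python
from math import exp, factorial


def poisson_pmf(goals, rate):
    """Probability of scoring exactly `goals` when the average is `rate`."""
    return exp(-rate) * rate ** goals / factorial(goals)


def match_probabilities(home_rate, away_rate, max_goals=10):
    """Probabilities of a home win, a draw and an away win.

    Assumes independent Poisson goal counts and ignores scorelines
    above max_goals, which are extremely unlikely for realistic rates.
    """
    home_win = draw = away_win = 0.0
    for home_goals in range(max_goals + 1):
        for away_goals in range(max_goals + 1):
            p = poisson_pmf(home_goals, home_rate) * poisson_pmf(away_goals, away_rate)
            if home_goals > away_goals:
                home_win += p
            elif home_goals == away_goals:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Made-up scoring rates, purely for illustration.
home, draw, away = match_probabilities(home_rate=1.6, away_rate=1.1)
print(f"Home win: {home:.3f}  Draw: {draw:.3f}  Away win: {away:.3f}")
```

Once you have probabilities like these, you can convert them to odds and compare them with the published lines; a bet only looks attractive when your own estimate is more generous than the one implied by the bookmaker.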

Long before Dixon and Coles even showed interest in modeling sports, Bill Benter had already spent a considerable amount of time studying bets [4]. He started with blackjack, after having read Edward Thorp’s Beat the Dealer, the famous book in which Thorp published his mathematically optimal blackjack strategy [9]. It proved to be a worthwhile read, as after a while Benter was making about $80,000 a year from the game. He teamed up with Alan Woods, an actuary with a shared interest in making money out of Las Vegas, in order to keep exploiting the loopholes in the card game. Sadly, after a while blackjack turned out to be less and less profitable: casinos became good at spotting card counters and it didn’t take long before the pair were banned from most of the important casinos around the globe. They turned their attention elsewhere, some 12,000 kilometers away from Las Vegas in fact: the Hong Kong race tracks. By using regression models, Benter and Woods devised strategies to predict races, taking into account several factors such as the weight of the horse, how many races it had participated in before, and the experience of the jockey, among others. Their predictions were then combined with the racing odds, thus taking the predictions of other gamblers into account as well. Their group was able to produce a $100,000 profit in its very first year, and the endeavour kept considerable margins even after Benter and Woods parted ways. In 1993, Benter published the basis of one of his models, so anyone interested in making money from horse tracks should give it a read [12]. More than twenty years later, statistical models have become the norm for horse race betting. Syndicates and companies alike employ statisticians and mathematicians every day to ensure their models keep an edge over those of their many competitors.

Just as gambling became the cornerstone of probability’s origins, probability and statistics are now returning the favour, providing gamblers with the latest and most advanced tools to gain an edge over the house. In reality, the task is fairly complex, as bookmakers and casinos are also adopting complex data analysis to keep up with the advances of professional gambling syndicates. As a consequence, the gambling market has recently evolved into a rather profitable endeavour for some and a harsh reality for others. Nowadays, as is happening in more and more industries around the globe, not using the latest statistics can significantly diminish your chances of winning. So be sure to bet on statistics.


[1] Powerball’s already massive jackpot rises to $1.4 billion. Chicago Tribune (January 11, 2016)

[2] Butler, Bill. Durango Bill’s Applied Mathematics – Powerball Odds

[3] Kucharski, Adam. Seven lucky ways that gambling changed maths. The Guardian (May, 2016)

[4] Kucharski, Adam. The Perfect Bet: How Science and Maths are Taking the Luck Out of Gambling. Profile Books; Main edition (5 May 2016)

[5] Rothman, Stanley. Sandlot Stats: Learning Statistics with Baseball. Johns Hopkins University Press (September 17, 2012)

[6] Knight, Chris. Germany Tipped For Glory In Our Euro 2016 Betting Preview. Betting Directory (June 6, 2016)

[7] Portugal vs. France odds, Euro 2016 final: France betting favorite in championship game. SB Nation Website (Jul 9, 2016)

[8] How to Calculate Betting Commission. Betting Expert (April, 2016)

[9] Grajales, Carlos Alberto. Can statistics make you rich? Statistics Views website (May, 2015)

[10] Dixon, Mark & Coles, Stuart. Modelling Association Football Scores and Inefficiencies in the Football Betting Market. Applied Statistics, Volume 46, Issue 2 (1997), 265-280.

[11] Who We Are. Atass Sports Website

[12] Benter, B. Computer based horse race handicapping and wagering systems: a report. Operations Research Society of America Conference (1993) Phoenix, AZ.