The statistics behind your Amazon recommendations

Author: Carlos Grajales

The air is filled with love again. We have reached that lovely part of the year, the one and only Valentine’s season. No one can deny that the atmosphere is already filling with the holiday’s spirit: decorations, events, TV shows and the frenzied shopping for our loved ones. And what shopping it is. In the United States alone, people spent a very respectable $19.7 billion on Valentine’s Day in 2016 (setting a new record, by the way) [1]. On average, each person spent about $146 that day. As you’d imagine, unmarried couples make the biggest effort and spend the most on each other, yet more than half of all Valentine’s Day consumers are single. They usually spend half of what the average non-single does, but this shows that the friendship side of Valentine’s still drives a lot of spending. And finally, in a shocking revelation, men usually spend twice as much as women. As a firm proponent of gender equality, I invite all fellow men to cut their gift spending in half this year.


With the amount of money spent on this particular day, you can easily guess how important the season is for retailers. In fact, all holiday shopping is crucial for the economy. For some sectors, such as jewelry or department stores, more than 24% of the year’s retail sales in the United States can be traced to the Christmas and end-of-year holiday season alone [2]. Curiously, the record-breaking amounts spent during Christmas and Valentine’s Day around the world do not necessarily translate into more people going to the stores. In fact, some consulting companies expected brick-and-mortar store traffic to be down 11 percent in November and 5 percent in December this past year, compared to the same months of 2015 [2]. Why? Simply because online retail is growing fast. Each year more people shop online, not only because of its convenience but also because of the intricate marketing and attraction methods developed by online retailers, methods that owe most of their fundamentals to the smart use of statistics.

Over the past decade, e-commerce has grown at a very healthy rate of almost 18 percent a year [3]. As of 2013, it already accounted for an estimated 8 percent of total retail sales. And the largest share of that basket undoubtedly belongs to Amazon, one of the largest online retailers on the planet. As you can imagine, Amazon’s success is not some random anomaly. It is one of the most analytical companies around, and nothing emphasizes its data-driven approach more than its recommendation algorithms.

Amazon employs many forms of recommendations to engage users, increasing average order value and inviting them to buy the latest (and certainly more profitable) items [4]. Its recommendation system is based on a number of simple elements: what a user has bought in the past, which items they have in their virtual shopping cart, the items they’ve rated and liked, and what other customers like them have viewed and purchased [5]. The result is a highly personalized browsing experience: a gadget enthusiast may find Amazon pages heavy on device suggestions, while a driving lover could see those same pages offering up car products.

Recommendation algorithms are quite popular and in no way exclusive to online retailers. At its root, a recommendation algorithm quantifies how similar customers are, based on their history of purchased or liked items and on other metrics, such as demographic or geographic characteristics [5] [6] [7]. Once the model has identified the user or set of users most like you, the algorithm aggregates the items bought by these similar customers, eliminates those you have already purchased or seen, and recommends the remaining items to you. Anyone with a working knowledge of multivariate statistics should already be thinking about the kind of metrics these algorithms use. Because they rely on (dis)similarity measures, recommendation algorithms are closely related to clustering methods. In fact, some very popular dissimilarity measures used for clustering, such as the cosine measure (angular separation between two vectors) or Jaccard’s binary coefficient, are also quite popular in recommendation algorithms [6] [7]. By the way, did you know that Jaccard’s coefficient wasn’t actually discovered by Paul Jaccard himself? It was originally developed by Grove Karl Gilbert [8]. In 1884, Gilbert wrote an essay criticizing a new method for forecasting tornadoes developed by J.P. Finley, a military meteorologist who proposed a simple forecasting algorithm with “extremely high” accuracy numbers [9]. Gilbert doubted such accuracy and developed an “index of prediction” as a more appropriate measure of the success of Finley’s forecasting method. By Gilbert’s calculations, Finley’s forecasts turned out to be really bad [10]. This index was the first version of the Jaccard coefficient. And after this historical note, we can return to our usual programming.
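
To make the similarity measures and the customer-based recommendation steps described above a bit more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the customer names, the purchase data and the function names are all invented, and real systems add many refinements on top of this.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    """Jaccard coefficient of two item sets: |A and B| / |A or B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(union)


def recommend(target, purchases, k=1):
    """Suggest items bought by the k customers most similar to `target`,
    leaving out anything the target customer already owns."""
    own = purchases[target]
    others = [user for user in purchases if user != target]
    neighbours = sorted(
        others,
        key=lambda user: jaccard_similarity(own, purchases[user]),
        reverse=True,
    )[:k]
    suggestions = set()
    for user in neighbours:
        suggestions |= purchases[user] - own
    return sorted(suggestions)


# Invented toy data: which customer bought which items.
purchases = {
    "ana": {"chocolates", "roses", "card"},
    "ben": {"chocolates", "roses", "perfume"},
    "cho": {"headphones", "charger"},
}

print(jaccard_similarity(purchases["ana"], purchases["ben"]))  # 2 shared of 4 distinct items
print(recommend("ana", purchases))
```

In this toy data, "ben" is the customer most similar to "ana", so the only item of his that she does not already own is what gets recommended.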

Even though Amazon’s algorithm is presented as extremely original (they even developed their own homegrown math, named “item-to-item collaborative filtering”), in reality it is a form of clustering in which, instead of clustering customers, they cluster items or products [7]. Amazon associates the many different products it sells based solely on which ones have been sold together, i.e., which items are usually bought by the same person. It calculates a similarity index between products and, as a customer purchases, rates or adds a product to their basket, the algorithm recommends highly similar items. Calculating a dissimilarity matrix for the millions of items available on Amazon is quite time-consuming, so this calculation is done offline. The resulting matrix is then fed to the online recommendation algorithm, which uses it to produce real-time recommendations.
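
The split between an offline similarity calculation and a fast online lookup can be sketched as follows. This is not Amazon’s actual implementation: the baskets are invented, the function names are hypothetical, and the cosine measure over "bought / not bought" vectors is just one reasonable choice of similarity index (the sketch stores similarities rather than dissimilarities, which carries the same information).

```python
import math
from collections import defaultdict
from itertools import combinations


def build_item_similarity(baskets):
    """Offline step: cosine similarity between items, where each item is a
    binary vector marking which customers bought it."""
    item_count = defaultdict(int)  # number of customers who bought each item
    pair_count = defaultdict(int)  # number of customers who bought both items
    for items in baskets.values():
        for item in items:
            item_count[item] += 1
        for a, b in combinations(sorted(items), 2):
            pair_count[(a, b)] += 1

    similarity = defaultdict(dict)
    for (a, b), both in pair_count.items():
        score = both / math.sqrt(item_count[a] * item_count[b])
        similarity[a][b] = score
        similarity[b][a] = score
    return similarity


def recommend_for_basket(basket, similarity, n=3):
    """Online step: rank items by their summed similarity to the basket,
    skipping items that are already in it."""
    scores = defaultdict(float)
    for item in basket:
        for other, score in similarity.get(item, {}).items():
            if other not in basket:
                scores[other] += score
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:n]]


# Invented toy data: one basket of purchased items per customer.
baskets = {
    "c1": {"roses", "chocolates"},
    "c2": {"roses", "chocolates", "card"},
    "c3": {"roses", "card"},
    "c4": {"headphones", "charger"},
}

similarity = build_item_similarity(baskets)  # computed once, offline
print(recommend_for_basket({"chocolates"}, similarity))  # cheap lookup, online
```

The expensive part, comparing every pair of co-purchased items, happens once in `build_item_similarity`; the online step only reads precomputed scores, which is what makes real-time recommendations feasible.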

Amazon’s algorithms are extremely successful, reportedly reaching conversion rates of about 60% in some cases, but they are by no means perfect, and some have criticized them for inaccurate suggestions [5]. In fact, other companies may be producing better-performing recommendation models, apparently surpassing Amazon’s accuracy.

Netflix is another company known for its analytical approach to marketing and user engagement. It uses A/B tests (randomized experimental designs used for marketing purposes) to improve its user interface and layout, and even to decide whether the image that appears at the top of the screen when you hover over a title should be static or a carousel. A rotation of three images worked best, according to their tests [11].
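
The article cited above does not say which statistical test Netflix applies, but a common way to analyze an A/B test such as "static image versus carousel" is a two-proportion z-test. The sketch below assumes that approach; the function name and the click counts are invented purely for illustration.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Two-sided z-test for the difference between two proportions.

    Returns the z statistic (positive when variant B does better) and the
    two-sided p-value under the normal approximation.
    """
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_b - p_a) / std_error
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented numbers: users who played a title out of those shown each layout.
z, p_value = two_proportion_z_test(
    success_a=200, total_a=2000,  # variant A: static image
    success_b=260, total_b=2000,  # variant B: rotating carousel
)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

With made-up counts like these, a small p-value would suggest that the difference between the two layouts is unlikely to be due to chance alone, which is the kind of evidence an A/B test is designed to produce.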

Netflix takes prediction algorithms so seriously that in 2006 it launched a contest inviting the public to find the best prediction algorithm, one that could outsmart the one it used back then [12]. It offered a $1 million prize to whoever improved the accuracy of its then-current system, as measured by the Root Mean Squared Error (RMSE), by at least 10%. The two winning algorithms have names familiar to statisticians: Singular Value Decomposition and Neural Networks (through Restricted Boltzmann Machines) [12]. Nowadays, Netflix still applies a form of Collaborative Filtering (the similarity-based approach also employed by Amazon), but with a statistical model attached to improve its accuracy.
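
For readers who have not met the RMSE before, here is a small Python sketch of how it is computed and how a "10% improvement" can be checked. The ratings and the two sets of predictions are invented, and the helper names are hypothetical; this is not the contest’s actual scoring code.

```python
import math


def rmse(predicted, actual):
    """Root Mean Squared Error between predicted and observed ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_errors / len(actual))


def relative_improvement(baseline_rmse, new_rmse):
    """Fraction by which the new model lowers the baseline's RMSE."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# Invented star ratings and predictions from two hypothetical models.
actual = [4, 3, 5, 2]
baseline_predictions = [3, 3, 4, 4]
challenger_predictions = [4, 3, 4, 3]

baseline = rmse(baseline_predictions, actual)
challenger = rmse(challenger_predictions, actual)
improvement = relative_improvement(baseline, challenger)

print(f"baseline RMSE = {baseline:.3f}, challenger RMSE = {challenger:.3f}")
print(f"improvement = {improvement:.1%}, beats 10% target: {improvement >= 0.10}")
```

Because the errors are squared before averaging, the RMSE punishes a prediction that is off by two stars much more than two predictions that are each off by one.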

This model, the best known one within Netflix, is the movie rating prediction, a model which aims to forecast the rating a user would give to an unseen movie. But this is hardly the only recommendation algorithm that Netflix currently uses. Besides predicting movie ratings, its internal algorithms also personalize a user’s screen by showing suggested genres, movies and series [12]. Interestingly, these algorithms account not only for a user’s taste, but also for “awareness”. The algorithm is less likely to recommend a “big” movie, one you’ve probably already heard of, as it tends to reward lesser-known independent films as a means of promoting diversity within the recommendations [12]. Not only does this promote new “experiences” for Netflix’s users, it also addresses a common problem for recommendation algorithms: people don’t usually explore products beyond their comfort zone. By promoting such diversity in its suggestions, Netflix encourages users to try new things and rate them, thus fine-tuning its models even more. This whole statistical recommendation scheme has been so successful that Netflix claims that about 75% of what people watch comes from some sort of recommendation, driving up viewing time and consumer satisfaction [12].

After reading this, you probably believe that recommendation algorithms are the only way to go for online companies. You’d be right: most online-based industries rely heavily on recommendation algorithms as their edge against the competition. Another streaming powerhouse, Spotify, has statistical models attached to its discover list algorithm [13]. But you’d be wrong to believe that only online-based industries will benefit from statistical models this upcoming Valentine’s Day, as brick-and-mortar stores are also adapting to the use of highly complex algorithms.

Meet TrueFit, a private company that uses models to help you decide whether an outfit is good for you. That’s right, it is a statistically driven “What not to wear” [14]. TrueFit mines data from millions of clothing products, gathering their specs and combining them with style attributes. Then, using statistical models, it compares those attributes with the shopper’s body type to predict whether the two are a match or, in other words, a good fit. The service has over 22 million registered users who took a short quiz describing themselves, their bodies and their tastes, and whose answers were saved to create a profile [14].

TrueFit claims that the use of its model produces not only increased sales for its clients but also fewer product returns. Surprisingly enough, it is traditional retail that has shown the most interest in such clothing algorithms. Brands such as Macy’s, Levi’s, Under Armour, Guess, American Eagle, Dorothy Perkins and Kenneth Cole are all working closely with TrueFit to improve their online shopping experiences and, perhaps not surprisingly, their retail efforts [15]. Macy’s has been particularly welcoming to these new data-driven approaches [16]. It has not only updated its online store to embrace TrueFit’s recommendations, but has also included them in its mobile app, so that the customer’s in-store experience can take advantage of the recommendations. It has even announced in-store kiosks where users can get personalized suggestions from the algorithm.

Considering the growing development of and investment in recommendation algorithms, by now you have surely heard a dozen times that the future will be entirely data-driven for private companies. In fact, this isn’t the future; it is happening today. Every single day, more businesses rely so heavily on statistical analyses that in a couple of years doing so will no longer be an advantage, but a necessity. But hey, what do I know? Better to listen to Rachel Bogan, partner at the agency Work & Co: “The companies who rise to the top will be the ones who figure out the perfect balance of data, customer service and user experience” [14]. And surely, those statistics-based companies will be the ones that earn your hard-earned cash this coming Valentine’s Day.


[1] 14 Irresistible Valentine’s Day Search Stats From Bing. Search Engine Journal (January, 2017)
[2] Halzack, Sarah. Are retailers in for a good holiday season? Here’s what we know so far. The Washington Post (September, 2016)
[3] Mackenzie, Ian et al. How retailers can keep up with consumers. McKinsey & Company Website (October, 2013)
[4] Krawiec, Tom. The Amazon Recommendations Secret to Selling More Online. Rejoiner Website (2016)
[5] Mangalindan, JP. Amazon’s recommendation secret. Fortune Website (July, 2012)
[6] Ridwan, Mahmud. Predicting Likes: Inside A Simple Recommendation Engine’s Algorithms. Toptal Website
[7] Linden, Greg et al. Amazon.com Recommendations: Item-to-Item Collaborative Filtering. Industry Report, IEEE Internet Computing (January, 2003)
[8] Gilbert, G. K. Finley’s tornado predictions. American Meteorological Journal, Vol. 1, pp. 166–172 (1884)
[9] Galway, Joseph G. J.P. Finley: the first severe storm forecaster. Bulletin of the American Meteorological Society, Vol. 66, No. 12 (December, 1985)
[10] Goodman, L. A. and Kruskal, W. H. Measures of Association for Cross Classifications. Springer Series in Statistics (1979). ISBN 978-1-4612-9995-0
[11] O’Reilly, Lara. Netflix lifted the lid on how the algorithm that recommends you titles to watch actually works. Business Insider Website (February, 2016)
[12] Amatriain, Xavier. Netflix Recommendations: Beyond the 5 stars (Part 1). The Netflix Tech Blog (April, 2012)