There might have been a time in life when you had a single malt whisky and thought that it “doesn’t taste like any other”. In fact, you might have noticed that some single malts are more distinctive than others. You might even want to go on a quest to find the most unique single malts, but given that single malts are expensive and not easily available, some data analysis might help.
There is a dataset of 86 single malts that has been floating about the interwebs for a while now, and there is some simple yet interesting analysis of that data. For example, check out this simple analysis with a K-means clustering of various single malts. It uses the dataset (which scores each of the 86 malts on 12 different axes) to cluster the malts and analyze which whiskies belong to similar groups.
Can we use the same data set to figure out which is the most unique whisky? It is not hard – all we need to do is to define a distance measure, and then find out the “distance” between each pair of single malts. Then, what we can do is to find the average distance of each single malt from the others using this measure, and then rank the drinks in order to find the most distinctive ones.
Given that we have a 12-dimensional data set, with every dimension scored on a 0-4 scale, a natural distance measure is one based on “cosine similarity”. If each whisky is considered to be a vector (of ratings on each of the 12 dimensions), the cosine similarity between two whiskies is the cosine of the angle between the 12-dimensional vectors representing the two whiskies, and the cosine distance is conventionally one minus that similarity.
We use this measure of distance here since we are primarily concerned about the “direction of the vector” rather than its magnitude. If a whisky has twice the score on every feature as another whisky, we can simply have two units of the latter (not strictly, since they are ratings, but you get the drift) to simulate the former, and they are pretty much the same! That said, given that all dimensions have similar scales, it is unlikely that other measures of distance would produce vastly different results.
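To make the measure concrete, here is a minimal sketch in Python with NumPy. The function name `cosine_distance` and the rating vectors are made up for illustration (they are not rows from the dataset), and the distance is assumed to be one minus the cosine similarity, which is the usual convention.

```python
import numpy as np

def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b); assumes neither vector is all zeros."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return 1.0 - similarity

# Hypothetical 12-feature profiles on the 0-4 scale (not real whiskies).
base = [1, 2, 0, 1, 0, 1, 1, 0, 2, 1, 1, 2]
doubled = [2 * x for x in base]                 # same direction, twice the magnitude
other = [0, 0, 4, 0, 3, 0, 0, 2, 0, 0, 0, 0]    # shares no non-zero feature with base

print(cosine_distance(base, doubled))
print(cosine_distance(base, other))
```

The first distance should be zero up to floating-point rounding, which is the “two units of the latter” argument in numbers. The second is 1, since the two profiles have no features in common.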
So which is the most distinctive whisky? Which single malt has the greatest “distance” from the others? We follow the procedure described above: find the distance between every pair of whiskies, and for each whisky, find the average (we present both mean and median, just for kicks!) distance between that whisky and every other whisky. Figure 1 shows the mean distance between the 10 “most unique” whiskies and the rest of the whiskies.
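The procedure can be sketched roughly as follows. This is an illustration under assumptions rather than the exact code behind the figures: the function names, the four whisky labels and the tiny 3-feature score table are all hypothetical, and cosine distance is taken as one minus cosine similarity.

```python
import numpy as np

def pairwise_cosine_distance(scores):
    """Distance matrix of 1 - cosine similarity; assumes no all-zero rows."""
    scores = np.asarray(scores, dtype=float)
    unit = scores / np.linalg.norm(scores, axis=1, keepdims=True)
    return 1.0 - unit @ unit.T

def average_distances(dist):
    """Mean and median distance from each item to every *other* item."""
    n = dist.shape[0]
    # Drop the diagonal (distance to itself) before averaging.
    others = dist[~np.eye(n, dtype=bool)].reshape(n, n - 1)
    return others.mean(axis=1), np.median(others, axis=1)

# Made-up data: 4 hypothetical whiskies scored on 3 features.
names = ["A", "B", "C", "D"]
scores = [[4, 0, 0],
          [0, 2, 1],
          [0, 2, 2],
          [0, 1, 2]]

dist = pairwise_cosine_distance(scores)
mean_d, median_d = average_distances(dist)

# Rank from most to least distinctive by mean distance.
ranking = [names[i] for i in np.argsort(-mean_d)]
print(ranking[0], mean_d[0], median_d[0])
```

Here “A” comes out as the most distinctive, because it shares no feature with the other three. With the real data, `scores` would be the 86 x 12 table of ratings.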
It is the extremely smoky Laphroaig that comes out on top, followed by Ardbeg and Lagavulin (neither of which I’ve ever had). What is interesting is the large gap between numbers 3 and 4 in the table: there is an 8 percentage point difference in the distance between Lagavulin and Clynelish (at number 4).
The question arises, however, as to whether this is the correct measure. Should we look at average distance between a particular whisky and every other whisky? Does this really tell us how unique this whisky is? Another measure would be to look at each whisky’s “nearest neighbour” (in terms of cosine distance rather than geographical distance) and see which whisky is farthest from its “nearest neighbour”. This produces some measure of “local rarity”: the “winner” on this measure is the whisky whose closest neighbour in the vector space is farthest away.
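A minimal sketch of this “farthest nearest neighbour” idea, assuming we already have a symmetric distance matrix. The function name, the labels and the numbers in the matrix below are invented for illustration only.

```python
import numpy as np

def nearest_neighbour_distance(dist):
    """For each item, return the distance to, and index of, its closest other item."""
    d = np.array(dist, dtype=float)   # copy, so the caller's matrix is untouched
    np.fill_diagonal(d, np.inf)       # an item is not its own neighbour
    return d.min(axis=1), d.argmin(axis=1)

# Hypothetical distance matrix for 4 made-up whiskies.
names = ["A", "B", "C", "D"]
dist = [[0.00, 0.05, 0.30, 0.40],
        [0.05, 0.00, 0.35, 0.45],
        [0.30, 0.35, 0.00, 0.35],
        [0.40, 0.45, 0.35, 0.00]]

nn_dist, nn_index = nearest_neighbour_distance(dist)
most_isolated = names[int(np.argmax(nn_dist))]
print(most_isolated)
```

In this toy matrix “D” wins, since even its closest neighbour (“C”) is 0.35 away, while “A” and “B” sit right next to each other.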
On this metric, the league table looks rather different – Glendronach is the whisky that has the “farthest nearest neighbour”. Figure 2 has the top 10 entries in this league table:
It is all good to know that Glendronach is farther from its nearest neighbour than any other Scotch whisky, and that Laphroaig is, on average, farther from the rest of the single malts than any other single malt. However, the next time you are passing through duty free and want to get something new but not dissimilar from your favourite whisky, how do you know which one to get?
Fear not, for Figure 3 shows you the three “nearest neighbours” for each whisky along with a degree of similarity. I’ve used my judgment and given you the nearest neighbour choices for only the more popular whiskies (basically, things that I’ve either consumed or seen in duty free shops). In case you don’t find your choice of malt in this list, don’t worry. You can write to me and I’ll send you the entire similarity matrix!
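If you would rather build such a lookup yourself, here is a rough sketch. It assumes a symmetric cosine distance matrix and reports similarity as one minus distance; the function name `top_neighbours`, the labels and the numbers are hypothetical.

```python
import numpy as np

def top_neighbours(dist, names, k=3):
    """Map each name to its k closest other items as (name, similarity) pairs."""
    d = np.array(dist, dtype=float)       # copy before modifying the diagonal
    np.fill_diagonal(d, np.inf)           # exclude the item itself
    order = np.argsort(d, axis=1)[:, :k]  # indices of the k smallest distances per row
    return {
        names[i]: [(names[j], 1.0 - d[i, j]) for j in order[i]]
        for i in range(len(names))
    }

# Hypothetical distance matrix for 4 made-up whiskies.
names = ["A", "B", "C", "D"]
dist = [[0.00, 0.05, 0.30, 0.40],
        [0.05, 0.00, 0.35, 0.45],
        [0.30, 0.35, 0.00, 0.35],
        [0.40, 0.45, 0.35, 0.00]]

neighbours = top_neighbours(dist, names, k=2)
for name, similar in neighbours.items():
    print(name, similar)
```

With the real 86 x 86 matrix and `k=3`, each entry would list three suggestions for a given malt, closest first.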