Correlation and causation

So I have this lecture on “smelling (statistical) bullshit” that I’ve delivered in several places, which I inevitably start with a lesson on how correlation doesn’t imply causation. I give a large number of examples of people mistaking correlation for causation, the class makes fun of everything that doesn’t apply to them, everyone sees this wonderful XKCD cartoon, and we move on.
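For readers who prefer seeing the lesson to hearing it, here is a toy sketch of the classic confounder: two series that track each other closely only because both are driven by a third variable. All the numbers (and the ice-cream/drownings framing) are invented for illustration.

```python
# Toy illustration of a confounder: ice cream sales and drowning
# incidents are both driven by temperature, so they correlate strongly
# even though neither causes the other. All numbers are made up.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx ** 0.5 * vy ** 0.5)

temperature = [15, 18, 22, 25, 28, 31, 33, 30, 26, 21, 17, 14]  # monthly
ice_cream = [2 * t + 5 for t in temperature]           # rises with heat
drownings = [round(0.5 * t) + 1 for t in temperature]  # also rises with heat

print(pearson(ice_cream, drownings))  # close to 1, yet no causation either way
```

The correlation is nearly perfect by construction, which is precisely the point: a strong correlation tells you nothing about which variable (if any) is doing the causing.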

One of my favourite examples of the correlation-causation confusion (which I don’t normally include in my slides) has to do with religion. Praying before an exam in which one did well doesn’t necessarily imply that the prayer caused the good performance, I explain. So far there has been no outward outrage at my lectures, but this point does visibly make people uncomfortable.

Going off on a tangent, the time in life when I discovered that I’m not religious was when I pondered over the correlation-causation issue some six or seven years back. Until then I’d had this irrational need to draw relationships between seemingly unrelated things that had happened together once or twice, and that had given me a lot of mental stress. Looking at things from a correlation-causation perspective, however, helped clear my mind on those things, and also made me believe that most religious activity is pointless. It gave me immense mental peace.

Yet, for most of the world, it is not freedom from religion but religion itself that gives them mental peace. People do absurd activities only because they think these activities lead to other good things happening, thanks to a small number of occasions when these things have coincided, either in their own lives or in the lives of their ancestors or gurus.

In one of my lectures a few years back I had remarked that one reason why humans still mistake correlation for causation is religion – for if correlation did not imply causation then most religious rituals would be rendered meaningless, and that would render people’s lives meaningless. Based on what I observed today, however, I think I’ve got this causality wrong.

It’s not because of religion that people mistake correlation for causation. Instead, we’ve evolved to recognise patterns whenever we observe them, and a side effect of that is that we immediately assume causation whenever we see things happening together. Religion is just a special case of application of this correlation-causation second nature to things in real life.

So my daughter (who is two and a half) and I were standing in our balcony this evening, observing that it had rained heavily last night. Heavy rain reminded my daughter of this time when we had visited a particular aunt last week – she clearly remembered watching the heavy rain from this aunt’s window. Perhaps none of our other visits to this aunt’s house really registered in the daughter’s imagination (it’s barely two months since we returned to Bangalore, so admittedly there aren’t that many data points), so this aunt’s house is inextricably linked in her mind to rain.

And this evening, because she wanted it to rain heavily again, the daughter suggested that we go visit this aunt once again. “We’ll go to Inna Ajji’s house and then it will start raining”, she kept saying. “Yes, it rained the last time we went there, but that was random. It wasn’t because we went there”, I kept saying. It wasn’t easy to explain.

You know, when you are about to have a kid you develop visions of how you’ll bring her up, what you’ll teach her, and what she’ll say to “jack” the world. Back then I’d decided that I’d teach my yet-unborn daughter that “correlation does not imply causation”, and that she could use it against “elders” who were telling her absurd stuff.

I hadn’t imagined that mistaking correlation for causation is so fundamental to human nature that actually teaching my daughter that correlation does not imply causation would be a fairly difficult task! Hopefully within the next year I can convince her.

What Ails Liverpool

So Liverpool FC has had a mixed season so far. They’re second in the Premier League with 36 points from 14 games (only points dropped being draws against ManCity, Chelsea and Arsenal), but are on the verge of going out of the Champions League, having lost all three away games.

Yesterday’s win over Everton was damn lucky, down to a 96th-minute freak goal scored by Divock Origi (I’d forgotten he’s still at the club). Last weekend’s 3-0 against Watford wasn’t as comfortable as the scoreline suggests, the scoring having been opened only midway through the second half. The 2-0 against Fulham before that was similarly a close-fought game.

Of concern to most Liverpool fans has been the form of the starting front three – Mo Salah, Roberto Firmino and Sadio Mane. The trio has missed a host of chances this season, and the team has looked incredibly ineffective in the away losses in the Champions League (the only shot on target in the 2-1 loss against PSG being the penalty that was scored by Milner).

There are positives, of course. The defence has been tightened considerably compared to last season. Liverpool aren’t leaking goals the way they did last season. There have been quite a few clean sheets so far this season. So far there has been no repeat of last season’s situation where they went 4-1 up against ManCity, only to quickly let in two goals and then set up a tense finish.

So my theory is this – each of the front three of Liverpool has an incredibly low strike rate. I don’t know if the xG stat captures this, but the number of chances required by each of Mane, Salah and Firmino before they convert is rather high. If the average striker converts one in two chances, these guys convert more like one in four (these numbers are pulled out of thin air; I haven’t looked at the statistics).

And even during the “glory days” of last season, when Liverpool were scoring like crazy, this low strike rate persisted. What helped then was a massive increase in the number of chances created. In the one game I watched live (against Spurs at Wembley), what struck me was the number of chances Salah kept missing. But as the chances kept coming, he ultimately scored one (Liverpool lost 4-1).

What I suspect is that as Klopp decided to tighten things up at the back this season, the number of chances being created has dropped. And with the low strike rate of each of the front three, this lower number of chances translates into a much lower number of goals scored. If we want last season’s scoring rate, we might also have to accept last season’s concession rate (though this season’s goalie is much, much better).
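The trade-off can be sketched with a couple of made-up numbers (like the one-in-two and one-in-four conversion rates above, these are pulled out of thin air; the helper names are mine):

```python
# Sketch of the trade-off: goals as a function of chances created and
# conversion rate. A one-in-four finisher needs twice the chance supply
# of a one-in-two finisher to score at the same rate.

def expected_goals(chances_per_game, conversion_rate):
    return chances_per_game * conversion_rate

def p_scores_at_least_once(chances, conversion_rate):
    # chance of at least one goal, if each chance converts independently
    return 1 - (1 - conversion_rate) ** chances

print(expected_goals(8, 0.25))           # 2.0 goals/game when chances flow freely
print(expected_goals(4, 0.25))           # 1.0 when the supply of chances halves
print(p_scores_at_least_once(4, 0.25))   # ~0.68: blanks become quite likely
```

Same finishers, half the chances, half the goals: tightening up at the back is not free.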

There ain’t no such thing as a free lunch.

Randomness and sample size

I have had a strange relationship with volleyball, as I’ve documented here. Unlike in most other sports I’ve played, I was a rather defensive volleyball player, excelling in backline defence, setting and blocking, rather than spiking.

The one aspect of my game which was out of line with the rest of my volleyball, but in line with my play in most other sports I’ve played competitively, was my serve. I had a big booming serve, which at school level was mostly unreturnable.

The downside of having an unreturnable serve, though, is that you are likely to miss your serve more often than others – hitting it too long, or into the net, or wide. And, as in one of the examples I quoted in my earlier post, it might mean not getting a chance to serve at all, as the warm-up serve gets returned or goes into the net.

So I was discussing my volleyball non-career with a friend who is now heavily involved in the game, and he thought that I had possibly been extremely unlucky. My own take is that given how little I played, there was a good chance of things going spectacularly wrong purely by chance.

Changing domains a little, there was a time when I was building strategies for algorithmic trading, in a class known as “statistical arbitrage”. The deal there is that you have a small “edge” on each trade, but if you do a large enough number of trades, you will make money. As it happened, the guy I was working for got spooked after the first couple of trades went bad, and shut down the strategy at a heavy loss.

Changing domains a little less this time, this is also the reason why you shouldn’t check your portfolio too often if you’re investing for the long term – in the short run, when there have been “fewer plays”, the chances of having a negative return are higher even if you’re in a mostly safe strategy, as I had illustrated in this blog post in 2008 (using the Livejournal URL since the table didn’t port well to wordpress).
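The arithmetic behind “fewer plays means a higher chance of being in the red” can be worked out exactly. Here is a sketch assuming an invented 52% edge per trade, each trade independently winning or losing one unit:

```python
# Exact illustration: probability of a net loss after n independent
# trades, each winning +1 with probability 0.52 and losing -1 otherwise.
# (The 52% edge is an invented number.)

from math import comb

def p_net_loss(n_trades, p_win=0.52):
    # net loss <=> strictly fewer than half the trades won
    return sum(comb(n_trades, k) * p_win ** k * (1 - p_win) ** (n_trades - k)
               for k in range((n_trades + 1) // 2))

print(p_net_loss(10))    # roughly 1 in 3 after just ten trades
print(p_net_loss(1000))  # far smaller after a thousand trades
```

With a genuine edge, the strategy is near-certain to pay off in the long run, but after a handful of trades a drawdown is entirely unremarkable – which is exactly what spooked the boss, and exactly why you shouldn’t check the portfolio every day.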

And changing domains once again, the sheer number of “samples” is possibly one reason that the whole idea of quantification of sport and “SABRmetrics” first took hold in baseball. The Major League Baseball season is typically 162 games long (and this is before the playoffs), which means that any small edge will translate into results in the course of the league. A smaller league would mean fewer games and thus more randomness, and a higher chance that a “better play” wouldn’t work out.

This also explains why, when “Moneyball” took off with the Oakland A’s around the turn of the millennium, they focussed mainly on league performance and not performance in the playoffs – in the latter, there are simply not enough “samples” for a marginal advantage in team strength to reliably show up in the results.

And this is the problem with newly appointed managers of elite football clubs in Europe “targeting the Champions League” – a knockout tournament of that format means that the best team need not always win. Targeting a national league, played out over at least 34 games in the season, is a much better bet.
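A quick back-of-envelope sketch makes the point, assuming (an invented number) that the better team wins any individual game with probability 0.55, games independent:

```python
# Probability that the slightly better side comes out on top of a
# contest of n games: a knockout tie, a playoff series, a full season.

from math import comb

def p_better_team_wins_majority(n_games, p=0.55):
    # wins strictly more than half of the n games
    return sum(comb(n_games, k) * p ** k * (1 - p) ** (n_games - k)
               for k in range(n_games // 2 + 1, n_games + 1))

print(p_better_team_wins_majority(1))    # 0.55: a one-off tie is nearly a coin flip
print(p_better_team_wins_majority(7))    # ~0.61: a playoff series helps only a little
print(p_better_team_wins_majority(162))  # close to 0.9: a full season mostly sorts it out
```

The same marginal advantage that is almost invisible over a knockout tie compounds into near-certainty over a long league – which is why the quantifiers went after the 162-game regular season first.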

Finally, there is also the issue of variance. A higher variance in performance means that observing a few instances of bad performance is not sufficient to conclude that the player is a bad performer – a great performance need not be too far away. For a player with less randomness in performance – a more steady player, if you will – a few bad performances will tell you that they are unlikely to come good. High risk, high return players, on the other hand, need to be given a longer rope.
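To put rough numbers on this (performances modelled as normally distributed purely for illustration, with invented means and variances):

```python
# A steady player and a streaky one with the same average score: the
# chance of a standout performance (say, a score above 80) differs
# hugely. All numbers are invented.

from math import erf, sqrt

def p_above(threshold, mean, sd):
    # P(X > threshold) for X ~ Normal(mean, sd)
    return 0.5 * (1 - erf((threshold - mean) / (sd * sqrt(2))))

print(p_above(80, mean=50, sd=5))   # steady player: essentially never
print(p_above(80, mean=50, sd=30))  # streaky player: about 1 game in 6
```

A run of poor scores from the steady player is strong evidence about his level; the same run from the streaky player tells you much less, because the big performance was never far away.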

I’d put this in a different way in a blog a few years back, about Mitchell Johnson.

Religion and survivorship bias

Biju Dominic of FinalMile Consulting has a piece in Mint about “what CEOs can learn from religion”. In it, he says,

Despite all the hype, the vast majority of these so-called highly successful, worthy of being emulated companies, do not survive even for few decades. On the other hand, religion, with all its inadequacies, continues to survive after thousands of years.

This is a fallacious comparison.

Firstly, comparing “religion” to a particular company isn’t dimensionally consistent. A better comparison would be to compare at the conceptual level – such as comparing “religion” to “joint stock company”. And like the former, the latter has done rather well for 300 years now, even if specific companies may fold up after a few years.

The other way to make an apples-to-apples comparison is to compare a particular company to a particular religion. And this is where survivorship bias comes in.

Most of the dominant religions of today are hundreds or thousands of years old. In the course of their journey to present-day strength, they first established their own base and then fought off competition from other upstart religions.

In other words, when Dominic talks about “religion” he is only taking into account religions that have displayed memetic fitness over a really long period. What he fails to take into account are the thousands of startup religions that get started every few years and then fade into nothingness.

Historically, such religions haven’t been well documented, but that doesn’t mean they didn’t exist. In contemporary times, one can only look at the thousands of “babas” with cults all around India – each is leading his/her own “startup religion”, and most of them are likely to sink without a trace.

Comparing the best in one class (religions that have survived and thrived over thousands of years) to the average of another class (the average corporation) just doesn’t make sense!

Astrology and Data Science

The discussion goes back some 6 years, when I’d first started setting up my data and management consultancy practice. Since I’d freshly quit my job to set up the said practice, I had plenty of time on my hands, and the wife suggested that I spend some of that time learning astrology.

Considering that I’ve never been remotely religious or superstitious, I found this suggestion preposterous (I had a funny upbringing in the matter of religion – my mother was insanely religious (including following a certain Baba), and my father was insanely rationalist, and I kept getting pulled in both directions).

Now, the wife has some (indirect) background in astrology. One of her aunts is an astrologer, and specialises in something called “prashNa shaastra”, where the prediction is made based on the time at which the client asks the astrologer a question. My wife believes this has resulted in largely correct predictions (though I suspect a strong dose of confirmation bias there), and (very strangely to me) seems to believe in the stuff.

“What’s the use of studying astrology if I don’t believe in it one bit”, I asked. “Astrology is very mathematical, and you are very good at mathematics. So you’ll enjoy it a lot”, she countered, sidestepping the question.

We went off into a long discussion on the origins of astrology, and how it resulted in early developments in astronomy (necessary in order to precisely determine the position of planets), and so on. The discussion got involved, and involved many digressions, as discussions of this sort might entail. And as you might expect with such discussions, my wife threw a curveball, “You know, you say you’re building a business based on data analysis. Isn’t data analysis just like astrology?”

I was stumped (ok I know I’m mixing metaphors here), and that had ended the discussion then.

Until I decided to bring it up recently. As it turns out, once again (after a brief hiatus when I decided I’d do a job) I’m in the process of setting up a data and management consulting business. The differences are that this time I’m in London, and that “data science” is now a thing (it wasn’t in 2011). And over the last year or so I’ve been rather disappointed to see what goes on in the name of “data science” around me.

This XKCD cartoon (which I’ve shared here several times) encapsulates it very well. People literally “pour data into a machine learning system” and then “stir the pile” hoping for the results.

In the process of applying fairly complex “machine learning” algorithms, I’ve seen people not bother about whether the analysis makes intuitive sense, whether there is “physical meaning” in what the analysis says, or whether the correlations actually reflect causation. It’s blind application: “run the data through a bunch of scikit-learn models and accept the output”.

And this is exactly how astrology works. There are a bunch of predictor variables (position of different “planets” in various parts of the “sky”). There is the observed variable (whether some disaster happened or not, basically), which is nicely in binary format. And then some of our ancients did some data analysis on this, trying to identify combinations of predictors that predicted the output (unfortunately they didn’t have the power of statistics or computers, so in that sense the models were limited). And then they simply accepted the outputs, without challenging why it makes sense that the position of Jupiter at the time of wedding affects how your marriage will go.
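The “stir the pile” failure mode is easy to reproduce in miniature. In this sketch (an invented setup, not anyone’s actual pipeline), both the features (“planet positions”) and the outcome (“disaster or not”) are pure noise, yet hunting across enough features always turns up a “predictor”:

```python
# Multiple comparisons in action: with enough candidate predictors,
# something always correlates with a purely random outcome.

import random

random.seed(42)

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx ** 0.5 * vy ** 0.5)

n_obs, n_features = 50, 200
outcome = [random.choice([0, 1]) for _ in range(n_obs)]   # "disaster?"
features = [[random.random() for _ in range(n_obs)]
            for _ in range(n_features)]                    # "planet positions"

best = max(abs(pearson(f, outcome)) for f in features)
print(best)  # a "strong" correlation found in pure noise
```

Whether the pattern-hunter is an ancient staring at the sky or an analyst staring at scikit-learn output, searching hard enough guarantees a find; the discipline lies in asking whether the find means anything.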

So I brought up the topic of astrology and data science again recently, saying “OK after careful analysis I admit that astrology is the oldest form of data science”. “That’s not what I said”, the wife countered. “I said that data science is new age astrology, and not the other way round”.

It’s hard to argue with that!

Biases, statistics and luck

Tomorrow Liverpool plays Manchester City in the Premier League. As things stand now I don’t plan to watch this game. This entire season so far, I’ve only watched two games. First, I’d gone to a local pub to watch Liverpool’s visit to Manchester City, back in September. Liverpool got thrashed 5-0.

Then in October, I went to Wembley to watch Tottenham Hotspur play Liverpool. The Spurs won 4-1. These two remain Liverpool’s only defeats of the season.

I might consider myself to be a mostly rational person but I sometimes do fall for the correlation-implies-causation bias, and think that my watching those games had something to do with Liverpool’s losses in them. Never mind that these were away games played against other top sides which attack aggressively. And so I have this irrational “fear” that if I watch tomorrow’s game (even if it’s from a pub), it might lead to a heavy Liverpool defeat.

And so I told Baada, a Manchester City fan, that I’m not planning to watch tomorrow’s game. And he got back to me with some statistics, which he’d heard from a podcast. Apparently it’s been 80 years since Manchester City did the league “double” (winning both home and away games) over Liverpool. And that it’s been 15 years since they’ve won at Anfield. So, he suggested, there’s a good chance that tomorrow’s game won’t result in a mauling for Liverpool, even if I were to watch it.

With the easy availability of statistics, it has become a thing among football commentators to supply them during commentary. At first hearing, things like “never done this in 80 years” or “never done that in the last 15 years” sound compelling, and you’re inclined to believe that there is something to these numbers.

I don’t remember if it was Navjot Sidhu who said that statistics are like a bikini (“what they reveal is significant but what they hide is crucial” or something). That Manchester City hasn’t done a double over Liverpool in 80 years doesn’t mean a thing, nor does it say anything that they haven’t won at Anfield in 15 years.

Basically, until the mid 2000s, City were a middling team. I remember telling Baada after the 2007 season (when Stuart Pearce got fired as City manager) that they’d surely be relegated the next season. And then came the investment from Thaksin Shinawatra. And the appointment of Sven-Goran Eriksson as manager. And then the YouTube signings. And later the investment from the Abu Dhabi investment group. And in 2016 the appointment of Pep Guardiola as manager. And the significant investment in players after that.

In other words, Manchester City of today is a completely different team from what they were even 2-3 years back. And they’re surely a vastly improved team compared to a decade ago. I know Baada has been following them for over 15 years now, but they’re unrecognisable from the time he started following them!

Yes, even with City being a much improved team, Liverpool have never lost to them at home in the last few years – but then Liverpool have generally been a strong team at home in these years! On the other hand, City’s 18-game winning streak (which included wins at Chelsea and Manchester United) came to an end (with a draw against Crystal Palace) only recently.

So anyways, here are the takeaways:

1. Whether I watch the game or not has no bearing on how well Liverpool will play. The instances from this season so far are based on small samples, and biased samples at that (since I chose to watch Liverpool’s two toughest games of the season).
2. The 80-year history of a fixture has no bearing, since teams have evolved significantly in those 80 years. That a record has stood so long has no predictive power for tomorrow’s game.
3. City have been in tremendous form this season, and Liverpool have just lost their key player (by selling Philippe Coutinho to Barcelona), so City can fancy their chances. That said, Anfield has been a fortress this season, so Liverpool might just hold out (or even win it).

All of this points to a good game tomorrow! Maybe I should just watch it!

Lessons from poker party

In the past I’ve drawn lessons from contract bridge on this blog – notably, I’d described a strategy called “queen of hearts” in order to maximise chances of winning in a game that is terribly uncertain. Now it’s been years since I played bridge, or any card game for that matter. So when I got invited for a poker party over the weekend, I jumped at the invitation.

This was only the second time ever that I’d played poker in a room – I’ve mostly played online where there are no monetary stakes and you see people go all in on every hand with weak cards. And it was a large table, with at least 10 players being involved in each hand.

A couple of pertinent observations (reasonable return for the £10 I lost that night).

Firstly, a windfall can make you complacent. I’m usually a conservative player, betting aggressively only when I know that I have good chances of winning. I haven’t played enough to have mugged up all the probabilities – that probably offers an edge to my opponents. But I have a reasonable idea of what constitutes a good hand, and bet accordingly.

My big drawdown happened in the hand immediately after I’d won big. After an hour or so of bleeding money, I’d suddenly more than broken even. That meant that in the next hand, I bet a bit more aggressively than my cards warranted. For a while I managed to stay rational (after the flop I knew I had a 1/6 chance of winning big, and having mugged up the Kelly criterion on my way to the party, bet accordingly).

And when the turn wasn’t to my liking I should’ve just gotten out – the (approximate) percentages didn’t make sense any more. But I simply kept at it, falling for the sunk cost fallacy (what I’d put into the hand thus far). I lost some 30 chips in that one hand, of which at least 21 came at the turn and the river. Without the high of having won the previous hand, I would’ve played more rationally and lost only 9. After all the lectures I’ve given on logic, correlation-causation and the sunk cost fallacy, I’m sad I lost so badly because of the last one.
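For the record, the Kelly criterion itself is a one-liner: with probability p of winning b times your stake, bet the fraction f* = p − (1 − p)/b of your bankroll, and fold if that comes out negative. The pot odds below are invented for illustration; only the 1/6 is from the story.

```python
# Kelly criterion sketch: optimal fraction of bankroll to stake on a
# bet that pays `net_odds` times the stake with probability `p_win`.

def kelly_fraction(p_win, net_odds):
    # net_odds: amount won per unit staked, on a win
    return p_win - (1 - p_win) / net_odds

# With a 1/6 chance of "winning big", the pot has to pay well enough:
print(kelly_fraction(1/6, 4))  # negative: at 4-to-1, the right move is to fold
print(kelly_fraction(1/6, 8))  # positive: at 8-to-1, a small bet is justified
```

The break-even point is at odds of exactly 5-to-1 here, which is why staying in the hand after the turn soured (worsening the effective odds) was the mistake.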

The second big insight is that poverty leads to suboptimal decisions. This is a well-studied topic in economics, but I got to experience it first-hand during the session. It was later in the night, as I was bleeding money (and was down to about 20 chips).

I got pocket aces (a pair of aces in hand) – something I should’ve bet aggressively with. But with the first three open cards falling far away from the face cards and being uncorrelated, I wasn’t sure of the total strength of my hand (mugging up the probabilities would’ve helped for sure!). So when I had to put in 10 chips to stay in the hand, I baulked, and folded.

Given the play on the table thus far, it was definitely a risk worth taking, and with more in the bank, I would have. But poverty and the Kelly Criterion meant that the number of chips that I was able to invest in the arguably strong hand was limited, and that limited my opportunity to profit from the game.

It is no surprise that the rest of the night petered out for me as my funds dwindled and my ability to play diminished. Maybe I should’ve bought in more when I was down to 20 chips – but then given my ability relative to the rest of the table, that would’ve been good money after bad.

Scott Adams, careers and correlation

I’ve written here earlier about how much I’ve been influenced by Scott Adams’s career advice about “being in the top quartile of two or more things”. To recap, this is what Adams wrote nearly ten years back:

If you want an average successful life, it doesn’t take much planning. Just stay out of trouble, go to school, and apply for jobs you might like. But if you want something extraordinary, you have two paths:

1. Become the best at one specific thing.
2. Become very good (top 25%) at two or more things.

The first strategy is difficult to the point of near impossibility. Few people will ever play in the NBA or make a platinum album. I don’t recommend anyone even try.

Having implemented this with varying degrees of success over the last 5-6 years, I propose a small correction – to follow Adams’s second strategy, you need to take correlation into account.

There’s no joy in becoming very good (top 25%) at two or more correlated things. For example, if you’re in the top 25% in both maths and physics, or maths and computer science, there’s not much joy, because these are correlated skills. Lots of people who are very good at maths are also very good at physics or computer science. So there is nothing special in being very good at such a combination.

The reason Adams succeeded is that he was very good at two or three things that are largely uncorrelated – drawing, telling jokes and understanding corporate politics don’t have much to do with one another. The combination of these three skills was rather rare, and putting them together resulted in the wildly successful Dilbert.

So the key is this – in order to be wildly successful, you need to be very good (top 25%) at two or three things that are not positively correlated with each other (either orthogonal or negative correlation works). That ensures that if you can put them together, you can offer something that very few others can offer.

Then again, the problem is that the market for such a combination of skills will be highly illiquid – low supply means that people who might demand such combinations have adapted to make do with easier-to-find substitutes, so demand is lower, and so on. In that sense, again, it’s a massive hit-or-miss!

When I missed my moment in the sun

Going through an old piece I’d written for Mint, while conducting research for something I’m planning to write, I realise that I’d come rather close to staking a claim as a great election forecaster. As it happened, I just didn’t have the balls to stick my neck out (yes, mixed metaphors and all that), and so I missed the chance to be a hero.

I was writing a piece on election forecasting, and the art of converting vote shares into seat shares, which is tricky business in a first-past-the-post system such as India’s. I was trying to explain how the number of “corners” in a contest can affect what seat share a particular vote share translates to, and I wrote about Uttar Pradesh.

Quoting from my article:

An opinion poll conducted by CNN-IBN and CSDS whose results were published last week predicted that in Uttar Pradesh, the Bharatiya Janata Party is likely to get 38% of the vote. The survey reported that this will translate to about 41-49 seats for the BJP. What does our model above say?

If you look at the graph for the four-cornered contest closely (figure 4), you will notice that 38% vote share literally falls off the chart. Only once before has a party secured over 30% of the vote in a four-cornered contest (Congress in relatively tiny Haryana in 2004, with 42%) and on that occasion went on to get 90% of the seats (nine out of 10).

Given that this number (38%) falls outside the range we have noticed historically for a four-cornered contest, it makes it unpredictable. What we can say, however, is that if a party can manage to get 38% of the votes in a four-cornered state such as Uttar Pradesh, it will go on to win a lot of seats.

As it turned out, the BJP did win nearly 90% of all seats in the state (71 out of 80 to be precise), stumping most election forecasters. As you can see, I had it all right there, except that I didn’t put it in that many words – I chickened out by saying “a lot of seats”. And so I’m still known as “the guy who writes on election data for Mint” rather than “that great election forecaster”.

Then again, you don’t want to be too visible with the predictions you make, and India’s second largest business newspaper is definitely not an “obscure place”. As I’d written a long time back regarding financial forecasts,

…take your outrageous prediction and outrageous reasons and publish a paper. It should ideally be in a mid-table journal – the top journals will never accept anything this outrageous, and you won’t want too much footage for it also.

In all probability your prediction won’t come true. Remember – it was outrageous. No harm with that. Just burn that journal in your safe (I mean take it out of the safe before you burn it). There is a small chance of your prediction coming true. In all likelihood it won’t, but just in case it does, pull that journal out of the safe and call in your journalist friends. You will be the toast of the international press.

So maybe choosing to not take the risk with my forecast was a rational decision after all. Just that it doesn’t appear so in hindsight.

Damming the Nile and diapers

One of the great engineering problems of the last century was to determine the pattern of flows of the Nile. It had been clear for at least a couple of millennia that the flow of the river was not regular, and that the annual flow did not follow something like a normal distribution.

The matter gained importance in the late 1800s when the British colonial government decided to dam the Nile. Understanding accurately the pattern of flows of the river was important to determine the capacity of the reservoir being built, so that both floods and droughts could be contained.

The problem was solved by Harold Edwin Hurst, a British hydrologist who was posted in Egypt for over 60 years in the twentieth century. Hurst characterised the flows as exhibiting “long-range dependence”, and managed to accurately model the variation in the flow of the river. In recognition of his services, Egyptians gave him the moniker “Abu Nil” (father of the Nile). Later on, Benoit Mandelbrot named the quantity that measures the long-range dependence of a time series after Hurst.

I’ve written about Hurst once before, in the context of financial markets, but I invoke him here with respect to a problem closer to me – the pattern of my daughter’s poop.

It is rather well known that pooping, even among babies, is not a continuous process. If someone poops 100ml a day (volume is easier to use than weight in the context of babies), it doesn’t mean they poop 4ml every hour. Poop comes in discrete bursts, and the number of such bursts per day depends on age, decreasing over time into adulthood.

One might think that a reasonable way to model poop is to assume that the amount of poop in each burst follows a normal distribution, and each burst is independent of the ones around it. However, based on a little over two months’ experience of changing my daughter’s diapers, I declare this kind of a model to be wholly inaccurate.

For, what I’ve determined is that far from being normal, pooping patterns follow long-range dependence. There are long time periods (spanning a few diaper changes) when there is no, or very little, poop. Then there are times when it flows at such a high rate that we need to change diapers at a far higher frequency than normal. And such periods are usually followed by other high-poop periods. And so on.

In other words, the amount of poop has positive serial correlation. And to use the index that Mandelbrot lovingly constructed and named in honour of Hurst, the Hurst exponent of my daughter’s (and other babies’) poop is much higher than 0.5.
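Here is a sketch of what positive serial correlation looks like, using an AR(1) process as a stand-in (strictly speaking AR(1) is a short-memory process, so this illustrates the clustering of bursts rather than true long-range dependence; no actual diaper data was involved):

```python
# A persistent process vs independent draws: when each value carries
# over part of the previous one, high-output periods cluster together.

import random

random.seed(7)

def lag1_autocorr(xs):
    # sample autocorrelation at lag 1
    n = len(xs)
    m = sum(xs) / n
    num = sum((xs[i] - m) * (xs[i + 1] - m) for i in range(n - 1))
    den = sum((x - m) ** 2 for x in xs)
    return num / den

# persistent process: each value carries over 70% of the previous one
persistent = [0.0]
for _ in range(20000):
    persistent.append(0.7 * persistent[-1] + random.gauss(0, 1))

independent = [random.gauss(0, 1) for _ in range(20000)]

print(lag1_autocorr(persistent))   # close to 0.7: bursts cluster
print(lag1_autocorr(independent))  # close to 0: no memory
```

In a series like the first one, a heavy-output stretch is most likely to be followed by another heavy-output stretch – which matches the diaper-changing experience far better than the independent-draws model.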

This makes me wonder if diaper manufacturers have taken this long-range dependence into account while determining diaper capacity. Or I wonder if, instead, they simply assume that parents will take care of this by adjusting the inter-diaper-change time period.

As Mandelbrot describes towards the end of his excellent The (Mis)Behaviour of Markets, you can use so-called “multifractal models”, which combine normal price increments with irregular time increments, to get an accurate (fractal) representation of the movement of stock prices.

PS: Apologies to those who got disgusted by the post. Until a massive burst a few minutes ago I’d never imagined I’d be comparing the flows of poop and the Nile!