## Correlation and causation

So I have this lecture on “smelling (statistical) bullshit” that I’ve delivered in several places, which I inevitably start with a lesson on how correlation doesn’t imply causation. I give a large number of examples of people mistaking correlation for causation, the class makes fun of everything that doesn’t apply to them, then everyone sees this wonderful XKCD cartoon and then we move on.

One of my favourite examples of correlation-causation (which I don’t normally include in my slides) has to do with religion. Praying before an exam in which one did well doesn’t necessarily imply that the prayer resulted in the good performance in the exam, I explain. So far, there has been no outward outrage at my lectures, but this does visibly make people uncomfortable.

Going off on a tangent, the time in life when I discovered for myself that I’m not religious was when I pondered over the correlation-causation issue some six or seven years back. Until then I’d had this irrational need to draw a relationship between seemingly unrelated things that had happened together once or twice, and that had given me a lot of mental stress. Looking at things from a correlation-causation perspective, however, helped clear up my mind on those things, and also made me believe that most religious activity is pointless. This was a time in life when I got immense mental peace.

Yet, for most of the world, it is not freedom from religion but religion itself that gives them mental peace. People do absurd activities only because they think these activities lead to other good things happening, thanks to a small number of occasions when these things have coincided, either in their own lives or in the lives of their ancestors or gurus.

In one of my lectures a few years back I had remarked that one reason why humans still mistake correlation for causation is religion – for if correlation did not imply causation then most religious rituals would be rendered meaningless and that would render people’s lives meaningless. Based on what I observed today, however, I think I’ve got this causality wrong.

It’s not because of religion that people mistake correlation for causation. Instead, we’ve evolved to recognise patterns whenever we observe them, and a side effect of that is that we immediately assume causation whenever we see things happening together. Religion is just a special case of application of this correlation-causation second nature to things in real life.

So my daughter (who is two and a half) and I were standing in our balcony this evening, observing that it had rained heavily last night. Heavy rain reminded my daughter of this time when we had visited a particular aunt last week – she clearly remembered watching the heavy rain from this aunt’s window. Perhaps none of our other visits to this aunt’s house really registered in the daughter’s imagination (it’s barely two months since we returned to Bangalore, so admittedly there aren’t that many data points), so this aunt’s house is inextricably linked in her mind to rain.

And this evening, because she wanted it to rain heavily again, the daughter suggested that we go visit this aunt once again. “We’ll go to Inna Ajji’s house and then it will start raining”, she kept saying. “Yes, it rained the last time we went there, but it was random. It wasn’t because we went there”, I kept saying. It wasn’t easy to explain it.

You know when you are about to have a kid you develop visions of how you’ll bring her up, and what you’ll teach her, and what she’ll say to “jack” the world. Back then I’d decided that I’d teach my yet-unborn daughter that “correlation does not imply causation” and she could use it against “elders” who were telling her absurd stuff.

I hadn’t imagined that mistaking correlation for causation is so fundamental to human nature that it would be a fairly difficult task to actually teach my daughter that correlation does not imply causation! Hopefully in the next one year I can convince her.

## English Premier League: Goal Difference to points correlation

So I was just looking down the English Premier League Table for the season, and I found that as I went down the list, the goal difference went lower. There’s nothing counterintuitive in this, but the degree of correlation seemed eerie.

So I downloaded the data and plotted a scatter-plot. And what do you have? A near-perfect regression. I even ran the regression and found a 96% R Square.
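For anyone who wants to try this at home, here is a minimal sketch of how that R Square can be computed for a straight-line fit of points on goal difference. The numbers below are made up purely for illustration (they are not the actual league table), and `r_squared` is a name I've picked for this sketch.

```python
def r_squared(xs, ys):
    """R-squared of the least-squares line y = a + b*x.

    For a single explanatory variable this equals the squared
    Pearson correlation between xs and ys.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


# Hypothetical (goal difference, points) pairs, NOT real EPL data.
goal_diff = [40, 25, 10, 0, -5, -20, -30]
points = [85, 70, 60, 50, 45, 35, 28]

print(round(r_squared(goal_diff, points), 3))
```

A value close to 1 means the points column is almost entirely explained by a straight line in goal difference, which is what the scatter-plot suggested.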

In other words, this EPL season has simply been all about scoring lots of goals and not letting in too many goals. It’s almost like the distribution of the goals itself doesn’t matter – apart from the relegation battle, that is!

PS: Look at the extent of Manchester City’s lead at the top. And what a scrap the relegation is!

## Biases, statistics and luck

Tomorrow Liverpool plays Manchester City in the Premier League. As things stand now I don’t plan to watch this game. This entire season so far, I’ve only watched two games. First, I’d gone to a local pub to watch Liverpool’s visit to Manchester City, back in September. Liverpool got thrashed 5-0.

Then in October, I went to Wembley to watch Tottenham Hotspur play Liverpool. The Spurs won 4-1. These two remain Liverpool’s only defeats of the season.

I might consider myself to be a mostly rational person but I sometimes do fall for the correlation-implies-causation bias, and think that my watching those games had something to do with Liverpool’s losses in them. Never mind that these were away games played against other top sides which attack aggressively. And so I have this irrational “fear” that if I watch tomorrow’s game (even if it’s from a pub), it might lead to a heavy Liverpool defeat.

And so I told Baada, a Manchester City fan, that I’m not planning to watch tomorrow’s game. And he got back to me with some statistics, which he’d heard from a podcast. Apparently it’s been 80 years since Manchester City did the league “double” (winning both home and away games) over Liverpool. And that it’s been 15 years since they’ve won at Anfield. So, he suggested, there’s a good chance that tomorrow’s game won’t result in a mauling for Liverpool, even if I were to watch it.

With the easy availability of statistics, it has become a thing among football commentators to supply them during the commentary. And on first hearing, things like “never done this in 80 years” or “never done that for the last 15 years” sound compelling, and you’re inclined to believe that there is something to these numbers.

I don’t remember if it was Navjot Sidhu who said that statistics are like a bikini (“what they reveal is significant but what they hide is crucial” or something). That Manchester City hasn’t done a double over Liverpool in 80 years doesn’t mean a thing, nor does it say anything that they haven’t won at Anfield in 15 years.

Basically, until the mid 2000s, City were a middling team. I remember telling Baada after the 2007 season (when Stuart Pearce got fired as City manager) that they’d be surely relegated next season. And then came the investment from Thaksin Shinawatra. And the appointment of Sven-Goran Eriksson as manager. And then the youtube signings. And later the investment from the Abu Dhabi investment group. And in 2016 the appointment of Pep Guardiola as manager. And the significant investment in players after that.

In other words, Manchester City of today is a completely different team from what they were even 2-3 years back. And they’re surely a vastly improved team compared to a decade ago. I know Baada has been following them for over 15 years now, but they’re unrecognisable from the time he started following them!

Yes, even with City being a much improved team, Liverpool have never lost to them at home in the last few years – but then Liverpool have generally been a strong team playing at home in these years! On the other hand, City’s 18-game winning streak (which included wins at Chelsea and Manchester United) only came to an end (with a draw against Crystal Palace) rather recently.

So anyways, here are the takeaways:

1. Whether I watch the game or not has no bearing on how well Liverpool will play. The instances from this season so far are based on 1. small samples and 2. biased samples (since I’ve chosen to watch Liverpool’s two toughest games of the season)
2. 80-year history of a fixture has no bearing since teams have evolved significantly in these 80 years. So saying a record stands so long has no meaning or predictive power for tomorrow’s game.
3. City have been in tremendous form this season, and Liverpool have just lost their key player (by selling Philippe Coutinho to Barcelona), so City can fancy their chances. That said, Anfield has been a fortress this season, so Liverpool might just hold (or even win it).

All of this points to a good game tomorrow! Maybe I should just watch it!

## Scott Adams, careers and correlation

I’ve written here earlier about how much I’ve been influenced by Scott Adams’s career advice about “being in top quartile of two or more things”. To recap, this is what Adams wrote nearly ten years back:

If you want an average successful life, it doesn’t take much planning. Just stay out of trouble, go to school, and apply for jobs you might like. But if you want something extraordinary, you have two paths:

1. Become the best at one specific thing.
2. Become very good (top 25%) at two or more things.

The first strategy is difficult to the point of near impossibility. Few people will ever play in the NBA or make a platinum album. I don’t recommend anyone even try.

Having implemented this to various degrees of success over the last 5-6 years, I propose a small correction – basically to follow the second strategy that Adams has mentioned, you need to take correlation into account.

Basically there’s no joy in becoming very good (top 25%) at two or more correlated things. For example, if you think you’re in the top 25% in terms of “maths and physics” or “maths and computer science” there’s not so much joy because these are correlated skills. Lots of people who are very good at maths are also very good at physics or computer science. So there is nothing special in being very good at such a combination.

Why Adams succeeded was that he was very good at 2-3 things that are largely uncorrelated – drawing, telling jokes and understanding corporate politics are not very correlated to each other. So the combination of these three skills of his was rather unique to find, and their combination resulted in the wildly successful Dilbert.

So the key is this – in order to be wildly successful, you need to be very good (top 25%) at two or three things that are not positively correlated with each other (either orthogonal or negative correlation works). That ensures that if you can put them together, you can offer something that very few others can offer.
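To get a feel for how much correlation matters here, the following is a rough simulation, not anything Adams wrote. It assumes (my assumption) that two skills are jointly normally distributed with correlation `rho`, and counts what share of people land in the top 25% of both. The function name and parameters are made up for this sketch.

```python
import math
import random
from statistics import NormalDist


def share_in_both_top_quartiles(rho, n=100_000, seed=0):
    """Estimate the share of people in the top 25% of two skills
    whose scores are standard normal with correlation rho."""
    rng = random.Random(seed)
    cutoff = NormalDist().inv_cdf(0.75)  # top-quartile threshold
    k = math.sqrt(1 - rho * rho)
    hits = 0
    for _ in range(n):
        skill_a = rng.gauss(0, 1)
        # Mix in an independent draw so corr(skill_a, skill_b) = rho.
        skill_b = rho * skill_a + k * rng.gauss(0, 1)
        if skill_a > cutoff and skill_b > cutoff:
            hits += 1
    return hits / n


for rho in (-0.5, 0.0, 0.5, 0.9):
    print(rho, share_in_both_top_quartiles(rho))
```

With uncorrelated skills the share should come out near 0.25 × 0.25 = 6.25%, and it climbs towards 25% as the correlation approaches one. So the more correlated the skills, the more crowded the combination.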

Then again, the problem there is that the market for this combination of skills will be highly illiquid – low supply means people who might demand such combinations would have adapted to make do with some easier to find substitute, so demand is lower, and so on. So in that sense, again, it’s a massive hit-or-miss!

## Writing and depression

It is now a well-documented fact (that I’m too lazy to google and provide links) that there exists a relationship between mental illness and creative professions such as writing.

Most pieces that talk about this relationship draw the causality in one way – that the mental illness helped the writer (or painter or filmmaker or whoever) focus and channel emotions into the product.

Having taken treatment for depression in the past, and having just finished a manuscript of a book, I might tend to agree that there exists a relationship between creativity and depression. However, I wonder if the causality runs the other way.

I’ve mentioned here a couple of months back that writing a book is hard because you are working months together with little tangible feedback, and there’s a real possibility that it might flop miserably. Consequently, you put fight to make the product as good as you can.

In the absence of feedback, you are your greatest critic, and you read, and re-read what you’ve written; you edit, and re-edit your passages until you’re convinced that they’re as good as they can be.

You get obsessed with your product. You start thinking that if it’s not perfect it is all doomed. You downplay the (rather large) random component that might affect the success of the product, and instead focus on making it as perfect as you can.

And this obsession can drive you mad. There are days when you sit with your manuscript and feel useless. There are times when you want to chuck months’ effort down the drain. And that depresses you. And affects other parts of your life, mostly negatively!

Again it’s rather early that I’m writing this blog post now – at a time when I’m yet to start marketing my book to publishers. However, it’s important that I document this relationship and causality now – before either spectacular success or massive failure takes me over!

## Correlation in defence purchases

Nitin Pai has a nice piece on defence procurement in Business Standard today. He writes:

Even if the planning process works as intended, it still means that the defence ministry merely adds up the individual requirements and goes about buying them. This is sub-optimal: consider a particular emerging threat that everyone agrees India needs to be prepared for. The army, navy and air force then prepare their own strategies and operational plans, for which they draw up a list of requirements. At the back of their minds, they know that the defence budget is more-or-less divided in a fixed ratio among them.

What he is saying, in other words, is that the defence ministry simply takes the arithmetic sum of demands from various components of the military, rather than taking correlation into account.

Let me explain using a toy example.

Let’s say that the Western wing of the Indian army (I’m making this up), the one that guards the border with Pakistan, wants 100 widgets that will come useful in case of a war. Let’s say that the Eastern wing of the Indian army, which guards the China border, wants 150 such widgets for the same purpose. The question is how many you should purchase.

According to Nitin, the defence ministry now doesn’t think. It simply adds up and buys 250. The question is if we actually need 250.

Let’s assume that these widgets are easily transportable, and let’s assume that the probability of a simultaneous conventional conflict with Pakistan and China is zero (given all three are nuclear states, this is a fair assumption). Do we still need 250 widgets? The answer is no, we only need 150, since we can quickly swing them over to where they are most required, and at the maximum, we need 150!

This is a case of negative correlation: the two demands never arise together. There could be a case of positive correlation also – perhaps the chance of an India-China conventional conflict actually goes up when an India-Pakistan conventional conflict is on, and this might lead to more prolonged battles, meaning we might need more than 250 widgets!
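The toy example can be written down as a tiny calculation: list the scenarios you think can actually happen, and buy for the worst one. This is only a sketch of the arithmetic above, with names I've made up (`widgets_to_buy`, `demands`). It does not model the "prolonged battles" effect that could push the need beyond 250.

```python
def widgets_to_buy(demands, possible_scenarios):
    """Worst-case demand across scenarios.

    demands: widgets each front needs if it is active.
    possible_scenarios: each scenario is the set of fronts active at once.
    """
    return max(
        sum(demands[front] for front in scenario)
        for scenario in possible_scenarios
    )


demands = {"west": 100, "east": 150}

# Assumption from the toy example: no simultaneous two-front conflict.
one_front_at_a_time = [{"west"}, {"east"}]
# What simple addition implicitly assumes: both fronts can flare up together.
two_fronts_possible = one_front_at_a_time + [{"west", "east"}]

print(widgets_to_buy(demands, one_front_at_a_time))  # 150
print(widgets_to_buy(demands, two_fronts_possible))  # 250
```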

The most famous example of ignoring correlation was the 2008 financial crisis, when ignored positive correlation led to mortgage backed securities and their derivatives blowing up. The Indian defence ministry can’t afford such a mistake.

## Narendra Modi and the Correlation Term

In a speech in Canada last night, Prime Minister Narendra Modi said that the relationship between India and Canada is like the “2ab term” in the formula for expansion of $(a+b)^2$.

Unfortunately for him, this has been widely lampooned on twitter, with some people seemingly not getting the mathematical reference, and others making up some unintended consequences of it.

In my opinion, however, it is a masterstroke, and brings to notice something that people commonly ignore – what I call the “correlation term”. When any kind of break up or disagreement happens – like someone quitting a job, or a couple breaking up, or a band disbanding, people are bound to ask the question of whose fault it was. The general assumption is that if two entities did not agree, it was because both of them sucked.

However, considering the frequency at which such events (breakups or disagreements) happen, and that people who are generally “good” are involved in such events, the badness of one of the parties involved simply cannot explain them. So the question arises – if both parties were flawless why did the relationship go wrong? And this is where the correlation term comes in!

It is rather easy to explain using vector algebra. If you have two vectors $A$ and $B$, the magnitude of the sum of the two vectors is given by $\sqrt{|A|^2 + |B|^2 + 2 |A||B| \cos \theta}$ where $|A|,|B|$ are the magnitudes of the two vectors respectively and $\theta$ is the angle between them. It is easy to see from the above formula that the magnitude of the sum of the vectors is dependent not only on the magnitudes of the individual vectors, but also on the angle between them.

To illustrate with some examples, if A and B are perfectly aligned ($\theta = 0, \cos \theta = 1$), then the magnitude of their vector sum is the sum of their magnitudes. If they oppose each other ($\cos \theta = -1$), then the magnitude of their vector sum is the difference of their magnitudes. And if A and B are orthogonal, then $\cos \theta = 0$ and the magnitude of their vector sum is $\sqrt{|A|^2 + |B|^2}$.
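The three cases are easy to check numerically. Here is a small sketch of the formula, where `sum_magnitude` is just a name chosen for this example and the magnitudes 3 and 4 are arbitrary.

```python
import math


def sum_magnitude(mag_a, mag_b, theta):
    """|A + B| from the two magnitudes and the angle theta (radians) between them."""
    squared = mag_a ** 2 + mag_b ** 2 + 2 * mag_a * mag_b * math.cos(theta)
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))


for label, theta in [("aligned", 0.0), ("orthogonal", math.pi / 2), ("opposed", math.pi)]:
    print(label, round(sum_magnitude(3, 4, theta), 6))
```

With magnitudes 3 and 4, the aligned case gives 7 (the sum), the opposed case gives 1 (the difference) and the orthogonal case gives 5.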

And if we move from vector algebra to statistics, then if A and B represent two (mean-centred) datasets, the “$\cos \theta$” is nothing but the correlation between A and B. And in the investing world, correlation is a fairly important and widely used concept!
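One caveat worth making explicit: the cosine equals the usual (Pearson) correlation once each dataset has had its mean subtracted. The sketch below computes the correlation the textbook way and as the cosine of the angle between the centred vectors, so you can see they agree. The two series are made-up numbers, and the function names are mine.

```python
import math
from statistics import mean, pstdev


def pearson(xs, ys):
    """Textbook Pearson correlation: covariance over product of std devs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pstdev(xs) * pstdev(ys))


def cosine_of_centred(xs, ys):
    """Cosine of the angle between the mean-centred versions of xs and ys."""
    mx, my = mean(xs), mean(ys)
    a = [x - mx for x in xs]
    b = [y - my for y in ys]
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


# Hypothetical returns of two assets, for illustration only.
asset_a = [1.0, 2.0, 0.5, -1.0, 1.5]
asset_b = [0.8, 1.9, 0.1, -0.7, 1.2]

print(pearson(asset_a, asset_b), cosine_of_centred(asset_a, asset_b))
```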

So essentially, the concept that the Prime Minister alluded to in his lecture in Canada is rather important, and while it is commonly used in both science and finance, it is something people generally disregard in their daily lives. From this point of view, kudos to the Prime Minister for bringing up this concept of the correlation term! And here is my interpretation of it:

At first I was a bit upset with Modi because he only mentioned “2ab” and left out the correlation term ($\cos \theta$). Thinking about it some more, I reasoned that the reason he left it out was to imply that it was equal to 1, or that the angle between the a and b in this case (i.e. India and Canada’s interests) is zero, or in other words, that India and Canada’s interests are perfectly aligned! There could have been no better way of putting it!

So thanks to the Prime Minister for bringing up this rather important concept of correlation to public notice, and I hope that people start appreciating the nuances of the concept rather than brainlessly lampooning him!

## How my IIMB Class explains the 2008 financial crisis

I have a policy of not enforcing attendance in my IIMB class. My view is that it’s better to have a small class of dedicated students rather than a large class of students who don’t want to be there. One of the upsides of this policy is that there has been no in-class sleeping. Almost. I caught one guy sleeping last week, in what was session 16 (out of 20). Considering that my classes are between 8 and 9:30 am on Mondays and Tuesdays, I like to take credit for it.

I also like to take credit for the fact that despite not enforcing attendance, attendance has been healthy. There have usually been between 40 and 50 students in each class (yes, I count, when I’ve bamboozled them with a question and the class has gone all quiet), skewed towards the latter number. Considering that there are 60 students registered for the course, this translates to a pretty healthy percentage. So perhaps I’ve been doing something right.

The interesting thing to note is that while there are about 45 people in each class, it’s never the same set of 45. I don’t think there’s a single student who’s attended all of my classes. However, people appear and disappear in a kind of random uncoordinated fashion, and the class attendance has remained in the forties, until last week that is. This had conditioned me into expecting a rather large class each time I climbed up that long flight of stairs to get into class.

While there were many causes of the 2008 financial crisis, one of the prime reasons shit hit the fan then was that CDOs (collateralised debt obligations) blew up. CDOs were an (at one point in time) innovative way of repackaging receivables (home loans or auto loans or credit card bills) so as to create a set of instruments of varying credit ratings.

To explain it in the simplest way, let’s say I’ve lent money to 100 people and each owes me a rupee each month. So I expect to get a hundred rupees each month. Now I carve it up into tranches and let’s say I promise Alice the “first 60 rupees” I receive each month. In return she pays me a fee. Bob will get the “next 20 rupees”, again for a fee. Note that if fewer than 60 people pay me this month, Bob gets nothing. Let’s say Eve gets the next 10 rupees, so in case fewer than 80 people pay up, Eve gets nothing. So this is very risky, and Eve pays much less for her tranche than Bob pays for his which is in turn much less than what Alice pays for hers. The last 10 rupees is so risky that no one will buy it and so I hold it.
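The carving-up is just a "waterfall": money collected fills the most senior tranche first and whatever is left trickles down. A minimal sketch of the toy example follows. The function name and the `TRANCHES` list are mine, and real CDO waterfalls are a lot messier than this.

```python
def waterfall(collected, tranches):
    """Pay tranches in order of seniority.

    tranches: list of (name, size) from most senior to most junior.
    Returns {name: rupees received this month}.
    """
    payouts = {}
    remaining = collected
    for name, size in tranches:
        paid = min(size, remaining)
        payouts[name] = paid
        remaining -= paid
    return payouts


# The toy example: 100 borrowers owing one rupee each per month.
TRANCHES = [("Alice", 60), ("Bob", 20), ("Eve", 10), ("me", 10)]

print(waterfall(87, TRANCHES))  # {'Alice': 60, 'Bob': 20, 'Eve': 7, 'me': 0}
print(waterfall(45, TRANCHES))  # {'Alice': 45, 'Bob': 0, 'Eve': 0, 'me': 0}
```

In a normal month with 87 payers, Alice and Bob are paid in full and Eve gets most of hers. In a bad month with 45 payers, Bob and Eve get nothing and even Alice falls short.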

Let’s assume that about 85 to 90 people have been paying on their loans each month. Not the same people, but different, like in my class. Both Alice and Bob are getting paid in full each month, and the return is pretty impressive considering the high ratings of the instruments they hold (yes these tranches got rated, and the best tranche (Alice’s) would typically get AAA, or as good as government bonds). So Alice and Bob make a fortune. Until the shit hits the fan that is.

The factor that led to healthy attendance in my IIMB class and what kept Alice and Bob getting supernormal returns was the same – “correlation”. The basic assumption in CDO markets was that home loans were uncorrelated – my default had nothing to do with your default. So both of us defaulting together is unlikely. When between 10 and 15 people are defaulting each month, that 40 (or even 20) people will default together in a given month has very low probability. Which is what kept Alice and Bob happy. It was similar in my IIMB class – the reason I bunk is uncorrelated to the reason you bunk, so lack of correlation in bunking means there is a healthy attendance in my class each day.
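Just how low is "very low probability"? If defaults really are independent, the number of defaulters in a month follows a binomial distribution, and the tail can be computed exactly. The sketch below assumes each of the 100 borrowers defaults independently with probability 12.5% (my pick, the midpoint of "between 10 and 15"). `prob_at_least` is a made-up helper name.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(at least k of n independent borrowers default), each with probability p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_BORROWERS = 100
P_DEFAULT = 0.125  # assumed: midpoint of 10 to 15 defaults per 100

print("P(20 or more default):", prob_at_least(20, N_BORROWERS, P_DEFAULT))
print("P(40 or more default):", prob_at_least(40, N_BORROWERS, P_DEFAULT))
```

Under independence, 20 or more defaults in a month is already unusual, and 40 or more is so improbable that for practical purposes it never happens. Everything hinges on that independence assumption.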

The problem in both cases, as you might have guessed, is that correlations started moving from zero to one. On Sunday and Monday night this week, they had “club selections” on IIMB campus. Basically IIMB has this fraud concept called clubs (which do nothing), which recruiters value for reasons I don’t know, and so students take them seriously. And each year’s officebearers are appointed by the previous year’s officebearers, and thus you have interviews. And so these interviews went on till late on Monday morning. People were tired, and some decided to bunk due to that. Suddenly, there was correlation in bunking! And attendance plummeted. Yesterday there were 10 people in class. Today perhaps 12. Having got used to a class of 45, I got a bit psyched out! Not much damage was done, though.

The damage was much greater in the other case. In the years leading up to 2008, the Federal Reserve had raised rates, thanks to which banks increased rates on home loans. The worst borrowers defaulted, because of which home prices fell, which is when shit truly hit the fan. The fall in home prices meant that many homes were now worth less than the debt outstanding on them, so it became rational for homeowners to default on their loans. This meant that defaults were now getting correlated! And so rather than 85 people paying in a month, maybe 45 people paid. Bob got wiped out. Alice lost heavily, too.
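Here is a crude simulation of what correlation does to Alice. It compares two worlds with the same average default rate of 12.5%: one where everyone defaults independently, and one where an occasional shared shock (think falling home prices) makes everyone far more likely to default in the same month. The shock model and every number in it are assumptions for illustration, not a description of how CDOs were actually modelled.

```python
import random


def share_of_months_alice_short(n_months, shock_prob, p_normal, p_shock, seed=0):
    """Share of months in which fewer than 60 of 100 borrowers pay,
    i.e. the senior 60-rupee tranche is not paid in full."""
    rng = random.Random(seed)
    short_months = 0
    for _ in range(n_months):
        # A shared shock raises everyone's default probability together.
        p_default = p_shock if rng.random() < shock_prob else p_normal
        payers = sum(1 for _ in range(100) if rng.random() >= p_default)
        if payers < 60:
            short_months += 1
    return short_months / n_months


# Both settings average a 12.5% default rate (0.95 * 0.10 + 0.05 * 0.60).
independent = share_of_months_alice_short(2000, 0.0, 0.125, 0.125)
correlated = share_of_months_alice_short(2000, 0.05, 0.10, 0.60)

print("independent defaults:", independent)
print("with a common shock: ", correlated)
```

The average default rate is identical in both worlds. The only thing that changes is whether the defaults bunch together, and that alone decides whether the "safe" tranche is safe.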

This was not all. Other people had bet on how much Alice would get paid. And when she didn’t get paid in full, these people lost a lot of money. And then they defaulted. And it set off a cascade. No one was willing to trade with anyone any more. Lehman Brothers couldn’t even put a value on the so-called “toxic assets” they held. The whole system collapsed.

It is uncanny how two disparate events such as people bunking my class and the 2008 financial crisis are correlated. And there – correlation rears its ugly head once again!

## The problem with precedence

One common bureaucratic practice across bureaucracies and across countries is that of “precedence”. If a certain action has “precedence” and the results of that preceding action have been broadly good, that action immediately becomes kosher. However, from the point of view of logical consistency, there are several problems with this procedure.

The first issue is that of small samples – if there is a small number of times a certain action has been tried in the past, the degree of randomness associated with the result of that action is significant. Thus, relying on the result of a handful of instances of prior action is not likely to be reliable.
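A back-of-the-envelope way to see this: suppose an action has no effect at all, and the outcome is good half the time purely by luck (that 50% is my assumption). The chance that every one of a handful of precedents turned out well is then far from negligible.

```python
def chance_all_good(n_precedents, p_good_by_luck=0.5):
    """Probability that all n independent precedents look good
    even though the action itself contributes nothing."""
    return p_good_by_luck ** n_precedents


for n in (1, 3, 5, 10):
    print(n, chance_all_good(n))
```

With three precedents, a useless action has a one in eight chance of sporting a spotless record. It takes ten precedents for that to fall to about one in a thousand.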

The second, and related, issue is that of correlation and causation. That the particular action in the past was associated with a particular result doesn’t necessarily mean that the result, whether good or bad, was a consequence of the action. The question we need to ask in this case is whether the result was because of or in spite of the action. It is quite possible that a lousy policy in the past led to good results thanks to a favourable market environment. It is equally possible that a fantastic policy led to lousy results because of a lousy environment.

Thus, when we evaluate precedence, we should evaluate the process and methodology, rather than result. We should accept that the action alone can never fully explain the result of the action, and thus evaluate the action in light of the prevailing conditions, etc. rather than just by the result.

It is going to take significant bureaucratic rethinking to accept this, but unless this happens it is unlikely that a bureaucracy can function effectively.

## Correlated judgment

When you judge people about something, you do not normally judge them on that thing alone. You also judge them on “correlated traits”. For example, there is this popular adage (that was popular when I was in IIT) that goes “beauty times brains equals constant”. This implies that anyone who is above average in terms of looks is likely to be below average in terms of mental capabilities. Whether such a correlation exists is not known, but by instinct if we see someone beautiful, we assume that the person is not great in terms of mental ability (in my later years at IIT, we recognized this limitation of the model and proposed “beauty times brains times availability equals constant”, acknowledging that beautiful intelligent people exist, but are most likely taken).

There is no end to such correlations, which usually make rounds around college campuses. For example, there is the “he is the partying types, so is unlikely to be a good worker”. Now, while it is true that the amount of time available to most people is constant, and that heavy partying can come at the cost of working, such an adage discounts the fact that some people could simply be better time managers, or don’t care much for some axis apart from partying and working (sleeping, for example!), which allows them to be good at two things that people are normally not good at!

It is common for people to judge people. However, thanks to implicit correlations of traits that are built into people’s minds, when you get judged on one thing, that is not the only thing you are judged on – you are also judged on the things that are correlated with that!

Time for more examples. Once my parents saw a friend of mine very evidently flirting with a girl. They immediately judged him as being “a flirt” and branded him thus. While judgmental, there is no mistake in that judgment – he was indeed a flirt, and would gladly admit to it. But then my parents, using their inbuilt correlation filters, went one step ahead. “He is such a flirt”, they told me, “We don’t think he is a good person. You should not hang out with him any more”!!

Back in 2005, in IIMB, I had stood for elections to the Academic Council. At a party a week before the elections I happened to get wasted, and ended up talking to people inappropriately. The next morning, as I was trying to get over my hangover, I heard “dude, how could you get wasted if you are standing for elections?” I have no clue how getting wasted at one party would make me a bad Academic Councillor! I must mention I lost the elections.

It was at a discussion yesterday with Bharati and my wife Priyanka that this topic of correlated judgment came up, when we were discussing how life in a business school can be unforgiving. A few minutes later, Priyanka popped up “that baby was so cute, I expected him to be dumb!”