Religion and survivorship bias

Biju Dominic of FinalMile Consulting has a piece in Mint about “what CEOs can learn from religion”. In it, he says,

Despite all the hype, the vast majority of these so-called highly successful, worthy of being emulated companies, do not survive even for few decades. On the other hand, religion, with all its inadequacies, continues to survive after thousands of years.

This is a fallacious comparison.

Firstly, comparing “religion” to a particular company isn’t dimensionally consistent. A better comparison would be to compare at the conceptual level – such as comparing “religion” to “joint stock company”. And like the former, the latter has done rather well for 300 years now, even if specific companies may fold up after a few years.

The other way to make an apples-to-apples comparison is to compare a particular company to a particular religion. And this is where survivorship bias comes in.

Most of today’s dominant religions are hundreds or thousands of years old. In the course of their journey to present-day strength, they first established their own base and then fought off competition from upstart religions.

In other words, when Dominic talks about “religion” he is only taking into account religions that have displayed memetic fitness over a really long period. What he fails to take into account are the thousands of startup religions that spring up every few years and then fade into nothingness.

Historically, such religions haven’t been well documented, but that doesn’t mean they didn’t exist. In contemporary times, one can only look at the thousands of “babas” with cults all around India – each is leading his/her own “startup religion”, and most of them are likely to sink without a trace.

Comparing the best in one class (religions that have survived and thrived over thousands of years) to the average of another class (the average corporation) just doesn’t make sense!


Astrology and Data Science

The discussion goes back some 6 years, when I’d first started setting up my data and management consultancy practice. Since I’d freshly quit my job to set up the said practice, I had plenty of time on my hands, and the wife suggested that I spend some of that time learning astrology.

Considering that I’ve never been remotely religious or superstitious, I found this suggestion preposterous (I had a funny upbringing in the matter of religion – my mother was insanely religious (including following a certain Baba), and my father was insanely rationalist, and I kept getting pulled in both directions).

Now, the wife has some (indirect) background in astrology. One of her aunts is an astrologer, and specialises in something called “prashNa shaastra”, where the prediction is made based on the time at which the client asks the astrologer a question. My wife believes this has resulted in largely correct predictions (though I suspect a strong dose of confirmation bias there), and (very strangely to me) seems to believe in the stuff.

“What’s the use of studying astrology if I don’t believe in it one bit”, I asked. “Astrology is very mathematical, and you are very good at mathematics. So you’ll enjoy it a lot”, she countered, sidestepping the question.

We went off into a long discussion on the origins of astrology, and how it resulted in early developments in astronomy (necessary in order to precisely determine the position of planets), and so on. The discussion got involved, and involved many digressions, as discussions of this sort might entail. And as you might expect with such discussions, my wife threw a curveball, “You know, you say you’re building a business based on data analysis. Isn’t data analysis just like astrology?”

I was stumped (ok I know I’m mixing metaphors here), and that had ended the discussion then.

Until I decided to bring it up recently. As it turns out, once again (after a brief hiatus when I decided I’d do a job) I’m in the process of setting up a data and management consulting business. The difference is that this time I’m in London, and that “data science” is a thing (it wasn’t in 2011). And over the last year or so I’ve been kinda disappointed to see what goes on in the name of “data science” around me.

This XKCD cartoon (which I’ve shared here several times) encapsulates it very well. People literally “pour data into a machine learning system” and then “stir the pile” hoping for the results.


In the process of applying fairly complex “machine learning” algorithms, I’ve seen people not really bother about whether the analysis makes intuitive sense, whether there is “physical meaning” in what the analysis says, or whether the correlations actually indicate causation. It’s blind application of “run the data through a bunch of scikit-learn models and accept the output”.

And this is exactly how astrology works. There are a bunch of predictor variables (position of different “planets” in various parts of the “sky”). There is the observed variable (whether some disaster happened or not, basically), which is nicely in binary format. And then some of our ancients did some data analysis on this, trying to identify combinations of predictors that predicted the output (unfortunately they didn’t have the power of statistics or computers, so in that sense the models were limited). And then they simply accepted the outputs, without challenging why it makes sense that the position of Jupiter at the time of wedding affects how your marriage will go.
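The problem with this kind of pattern-hunting can be sketched with a toy experiment. Everything below is made up – random “planet positions”, random “disasters”, and a brute-force search over rules – but it shows how, if you search hard enough over enough predictor combinations, a spuriously “predictive” rule always emerges from pure noise:

```python
import random
from itertools import combinations

random.seed(42)

# Toy dataset: 12 "planets", each in one of 12 "houses", for 200 observations,
# with a random binary "disaster" outcome that has no relationship to any of them.
n_obs, n_planets, n_houses = 200, 12, 12
positions = [[random.randrange(n_houses) for _ in range(n_planets)]
             for _ in range(n_obs)]
disaster = [random.random() < 0.3 for _ in range(n_obs)]

# Accuracy of the boring baseline: always predict "no disaster".
baseline = sum(1 for d in disaster if not d) / n_obs

# "Stir the pile": try every rule of the form "disaster iff planet p1 is in
# house h1 AND planet p2 is in house h2", and keep whichever rule scores best
# on this very same data.
best_rule, best_acc = None, 0.0
for p1, p2 in combinations(range(n_planets), 2):
    for h1 in range(n_houses):
        for h2 in range(n_houses):
            pred = [row[p1] == h1 and row[p2] == h2 for row in positions]
            acc = sum(p == d for p, d in zip(pred, disaster)) / n_obs
            if acc > best_acc:
                best_rule, best_acc = (p1, h1, p2, h2), acc

# The "best" rule matches or beats the baseline on its own training data,
# despite the outcomes being pure noise: overfitting dressed up as insight.
print(baseline, best_rule, best_acc)
```

The rule found here would fail completely on fresh data, which is exactly the point: without asking whether the rule makes sense, the search procedure alone guarantees an impressive-looking result.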

So I brought up the topic of astrology and data science again recently, saying “OK after careful analysis I admit that astrology is the oldest form of data science”. “That’s not what I said”, the wife countered. “I said that data science is new age astrology, and not the other way round”.

It’s hard to argue with that!

Biases, statistics and luck

Tomorrow Liverpool plays Manchester City in the Premier League. As things stand now I don’t plan to watch this game. This entire season so far, I’ve only watched two games. First, I’d gone to a local pub to watch Liverpool’s visit to Manchester City, back in September. Liverpool got thrashed 5-0.

Then in October, I went to Wembley to watch Tottenham Hotspur play Liverpool. The Spurs won 4-1. These two remain Liverpool’s only defeats of the season.

I might consider myself to be a mostly rational person but I sometimes do fall for the correlation-implies-causation bias, and think that my watching those games had something to do with Liverpool’s losses in them. Never mind that these were away games played against other top sides which attack aggressively. And so I have this irrational “fear” that if I watch tomorrow’s game (even if it’s from a pub), it might lead to a heavy Liverpool defeat.

And so I told Baada, a Manchester City fan, that I’m not planning to watch tomorrow’s game. And he got back to me with some statistics, which he’d heard from a podcast. Apparently it’s been 80 years since Manchester City did the league “double” (winning both home and away games) over Liverpool. And that it’s been 15 years since they’ve won at Anfield. So, he suggested, there’s a good chance that tomorrow’s game won’t result in a mauling for Liverpool, even if I were to watch it.

With the easy availability of statistics, it has become a thing among football commentators to supply them during commentary. And at first hearing, things like “never done this in 80 years” or “never done that in the last 15 years” sound compelling, and you’re inclined to believe that there is something to these numbers.

I don’t remember if it was Navjot Sidhu who said that statistics are like a bikini (“what they reveal is significant but what they hide is crucial” or something). That Manchester City hasn’t done a double over Liverpool in 80 years doesn’t mean a thing, nor does it say anything that they haven’t won at Anfield in 15 years.

Basically, until the mid-2000s, City were a middling team. I remember telling Baada after the 2007 season (when Stuart Pearce got fired as City manager) that they’d surely be relegated the next season. And then came the investment from Thaksin Shinawatra. And the appointment of Sven-Goran Eriksson as manager. And then the YouTube signings. And later the investment from the Abu Dhabi United Group. And in 2016 the appointment of Pep Guardiola as manager. And the significant investment in players after that.

In other words, Manchester City of today is a completely different team from what they were even 2-3 years back. And they’re surely a vastly improved team compared to a decade ago. I know Baada has been following them for over 15 years now, but they’re unrecognisable from the time he started following them!

Yes, even with City being a much improved team, Liverpool have never lost to them at home in the last few years – but then Liverpool have generally been a strong team playing at home in these years! On the other hand, City’s 18-game winning streak (which included wins at Chelsea and Manchester United) only came to an end (with a draw against Crystal Palace) rather recently.

So anyways, here are the takeaways:

  1. Whether I watch the game or not has no bearing on how well Liverpool will play. The instances from this season so far are based on (a) small samples and (b) biased samples (since I’ve chosen to watch Liverpool’s two toughest games of the season).
  2. 80-year history of a fixture has no bearing since teams have evolved significantly in these 80 years. So saying a record stands so long has no meaning or predictive power for tomorrow’s game.
  3. City have been in tremendous form this season, and Liverpool have just lost their key player (by selling Philippe Coutinho to Barcelona), so City can fancy their chances. That said, Anfield has been a fortress this season, so Liverpool might just hold (or even win it).
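The first takeaway is easy to make concrete. Assuming (a made-up number) that Liverpool loses a tough away game at a top side half the time regardless of who is watching, seeing defeats in both of the games I watched is an entirely ordinary 1-in-4 event:

```python
import random

random.seed(1)

# Assumed probability that Liverpool loses a tough away game at a top side,
# independent of who is watching (an illustrative number, not a real estimate).
p_loss_tough_away = 0.5

# Probability of losses in BOTH watched games purely by chance:
p_two_losses = p_loss_tough_away ** 2
print(p_two_losses)  # 0.25 -- a 1-in-4 event, hardly evidence of a jinx

# Simulation version: how often would a non-jinxed watcher see two defeats?
trials = 100_000
two_defeats = sum(
    random.random() < p_loss_tough_away and random.random() < p_loss_tough_away
    for _ in range(trials)
)
print(two_defeats / trials)
```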

All of this points to a good game tomorrow! Maybe I should just watch it!



Lessons from poker party

In the past I’ve drawn lessons from contract bridge on this blog – notably, I’d described a strategy called “queen of hearts” in order to maximise chances of winning in a game that is terribly uncertain. Now it’s been years since I played bridge, or any card game for that matter. So when I got invited for a poker party over the weekend, I jumped at the invitation.

This was only the second time ever that I’d played poker in a room – I’ve mostly played online where there are no monetary stakes and you see people go all in on every hand with weak cards. And it was a large table, with at least 10 players being involved in each hand.

A couple of pertinent observations (reasonable return for the £10 I lost that night).

Firstly a windfall can make you complacent. I’m usually a conservative player, bidding aggressively only when I know that I have good chances of winning. I haven’t played enough to have mugged up all the probabilities – that probably offers an edge to my opponents. But I have a reasonable idea of what constitutes a good hand and bid accordingly.

My big drawdown happened in the hand immediately after I’d won big. After an hour or so of bleeding money, I’d suddenly more than broken even. That meant that in my next hand, I bid a bit more aggressively than I would have for what I had. For a while I managed to stay rational (after the flop I knew I had a 1/6 chance of winning big, and having mugged up the Kelly Criterion on my way to the party, bid accordingly).

And when the turn wasn’t to my liking I should’ve just gotten out – the (approx) percentages didn’t make sense any more. But I simply kept at it, falling for the sunk cost fallacy (what I’d put in thus far in the hand). I lost some 30 chips in that one hand, of which at least 21 came at the turn and the river. Without the high of having won the previous hand, I would’ve played more rationally and lost only 9. After all the lectures I’ve given on logic, correlation-causation and the sunk cost fallacy, I’m sad I lost so badly because of the last one.

The second big insight is that poverty leads to suboptimal decisions. Now, this is a well-studied topic in economics but I got to experience it first hand during the session. This was later on in the night, as I was bleeding money (and was down to about 20 chips).

I got pocket aces (a pair of aces in hand) – something I should’ve bid aggressively with. But with the first 3 open cards falling far away from the face cards and being uncorrelated, I wasn’t sure of the total strength of my hand (mugging up probabilities would’ve helped for sure!). So when I had to put in 10 chips to stay in the hand, I baulked, and folded.

Given the play on the table thus far, it was definitely a risk worth taking, and with more in the bank, I would have. But poverty and the Kelly Criterion meant that the number of chips that I was able to invest in the arguably strong hand was limited, and that limited my opportunity to profit from the game.

It is no surprise that the rest of the night petered out for me as my funds dwindled and my ability to play diminished. Maybe I should’ve bought in more when I was down to 20 chips – but then given my ability relative to the rest of the table, that would’ve been good money after bad.

Scott Adams, careers and correlation

I’ve written here earlier about how much I’ve been influenced by Scott Adams’s career advice about “being in the top quartile of two or more things”. To recap, this is what Adams wrote nearly ten years back:

If you want an average successful life, it doesn’t take much planning. Just stay out of trouble, go to school, and apply for jobs you might like. But if you want something extraordinary, you have two paths:

1. Become the best at one specific thing.
2. Become very good (top 25%) at two or more things.

The first strategy is difficult to the point of near impossibility. Few people will ever play in the NBA or make a platinum album. I don’t recommend anyone even try.

Having implemented this to various degrees of success over the last 5-6 years, I propose a small correction – basically to follow the second strategy that Adams has mentioned, you need to take correlation into account.

Basically there’s no joy in becoming very good (top 25%) at two or more correlated things. For example, if you think you’re in the top 25% in terms of “maths and physics” or “maths and computer science” there’s not so much joy because these are correlated skills. Lots of people who are very good at maths are also very good at physics or computer science. So there is nothing special in being very good at such a combination.

Why Adams succeeded was that he was very good at 2-3 things that are largely uncorrelated – drawing, telling jokes and understanding corporate politics are not very correlated to each other. So the combination of these three skills of his was rather unique to find, and their combination resulted in the wildly successful Dilbert.

So the key is this – in order to be wildly successful, you need to be very good (top 25%) at two or three things that are not positively correlated with each other (either orthogonal or negative correlation works). That ensures that if you can put them together, you can offer something that very few others can offer.

Then again, the problem there is that the market for this combination of skills will be highly illiquid – low supply means that people who might demand such combinations have adapted to make do with easier-to-find substitutes, so demand is lower, and so on. So in that sense, again, it’s a massive hit-or-miss!

When I missed my moment in the sun

Going through an old piece I’d written for Mint, while conducting research for something I’m planning to write, I realise that I’d come rather close to staking claim as a great election forecaster. As it happened, I just didn’t have the balls to stick my neck out (yes, mixed metaphors and all that) and so I missed the chance to be a hero.

I was writing a piece on election forecasting, and the art of converting vote shares into seat shares, which is tricky business in a first-past-the-post system such as India’s. I was trying to explain how the number of “corners of contests” can have an impact on what seat share a particular vote share translates to, and I wrote about Uttar Pradesh.
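To see why the conversion is so nonlinear, here is a toy simulation of a four-cornered contest. All the numbers are assumed for illustration – 38% for the leading party, three rivals splitting the rest, and a made-up level of seat-to-seat variation – but the amplification effect is the real phenomenon:

```python
import random

random.seed(2014)

# Toy four-cornered contest: one party polls 38% statewide, three rivals split
# the rest roughly evenly (hypothetical numbers, loosely inspired by the piece).
mean_shares = [0.38, 0.22, 0.20, 0.20]
n_seats = 80
noise_sd = 0.05  # assumed seat-to-seat variation in vote share

leader_wins = 0
for _ in range(n_seats):
    shares = [random.gauss(m, noise_sd) for m in mean_shares]
    # first past the post: whoever polls highest in the seat wins it
    if shares.index(max(shares)) == 0:
        leader_wins += 1

# A 38% vote share in a four-way fight translates into a hugely
# disproportionate seat share:
print(leader_wins, "of", n_seats, "seats")
```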

Quoting from my article:

An opinion poll conducted by CNN-IBN and CSDS whose results were published last week predicted that in Uttar Pradesh, the Bharatiya Janata Party is likely to get 38% of the vote. The survey reported that this will translate to about 41-49 seats for the BJP. What does our model above say?

If you look at the graph for the four-cornered contest closely (figure 4), you will notice that 38% vote share literally falls off the chart. Only once before has a party secured over 30% of the vote in a four-cornered contest (Congress in relatively tiny Haryana in 2004, with 42%) and on that occasion went on to get 90% of the seats (nine out of 10).

Given that this number (38%) falls outside the range we have noticed historically for a four-cornered contest, it makes it unpredictable. What we can say, however, is that if a party can manage to get 38% of the votes in a four-cornered state such as Uttar Pradesh, it will go on to win a lot of seats.

As it turned out, the BJP did win nearly 90% of all seats in the state (71 out of 80 to be precise), stumping most election forecasters. As you can see, I had it all right there, except that I didn’t put it in that many words – I chickened out by saying “a lot of seats”. And so I’m still known as “the guy who writes on election data for Mint” rather than “that great election forecaster”.

Then again, you don’t want to be too visible with the predictions you make, and India’s second largest business newspaper is definitely not an “obscure place”. As I’d written a long time back regarding financial forecasts,

…take your outrageous prediction and outrageous reasons and publish a paper. It should ideally be in a mid-table journal – the top journals will never accept anything this outrageous, and you won’t want too much footage for it also.

In all probability your prediction won’t come true. Remember – it was outrageous. No harm with that. Just burn that journal in your safe (I mean take it out of the safe before you burn it). There is a small chance of your prediction coming true. In all likelihood it won’t, but just in case it does, pull that journal out of that safe and call in your journalist friends. You will be the toast of the international press.

So maybe choosing to not take the risk with my forecast was a rational decision after all. Just that it doesn’t appear so in hindsight.

Damming the Nile and diapers

One of the greatest engineering problems in the last century was to determine the patterns in the flow of the Nile. It had been clear for at least a couple of millennia that the flow of the river was not regular, and the annual flow did not follow something like a normal distribution.

The matter gained importance in the late 1800s when the British colonial government decided to dam the Nile. Understanding accurately the pattern of flows of the river was important to determine the capacity of the reservoir being built, so that both floods and droughts could be contained.

The problem was solved by Harold Edwin Hurst, a British hydrologist who was posted in Egypt for over 60 years in the 20th Century. Hurst defined his model as one of “long-range dependence”, and managed to accurately predict the variation in the flow of the river. In recognition of his services, Egyptians gave him the moniker “Abu Nil” (father of the Nile). Later on, Benoit Mandelbrot named a quantity that determines the long-range dependence of a time series after Hurst.

I’ve written about Hurst once before, in the context of financial markets, but I invoke him here with respect to a problem closer to me – the pattern of my daughter’s poop.

It is rather well known that pooping, even among babies, is not a continuous process. If someone were to produce 100ml of poop a day (it’s easier to use volume rather than weight in the context of babies), it doesn’t mean they poop 4ml every hour. Poop happens in discrete bursts, and the number of such bursts per day depends upon age, decreasing over time into adulthood.

One might think that a reasonable way to model poop is to assume that the amount of poop in each burst follows a normal distribution, and each burst is independent of the ones around it. However, based on a little over two months’ experience of changing my daughter’s diapers, I declare this kind of a model to be wholly inaccurate.

For, what I’ve determined is that far from being normal, pooping patterns follow long-range dependence. There are long time periods (spanning a few diaper changes) when there is no, or very little, poop. Then there are times when it flows at such a high rate that we need to change diapers at a far higher frequency than normal. And such periods are usually followed by other high-poop periods. And so on.

In other words, the amount of poop has positive serial correlation. And to use the index that Mandelbrot lovingly constructed and named in honour of Hurst, the Hurst exponent of my daughter’s (and other babies’) poop is much higher than 0.5.
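For the curious, here is a crude sketch of how one might estimate a Hurst exponent using Hurst’s rescaled-range (R/S) statistic. The “poop” series below is entirely synthetic: white noise versus an AR(1) process standing in for persistent, bursty behaviour (the 0.9 persistence parameter is made up), with the persistent series showing the higher exponent:

```python
import math
import random

random.seed(0)

def hurst_rs(series, window_sizes=(16, 32, 64, 128, 256)):
    """Crude Hurst exponent estimate: slope of log(R/S) vs log(window size)."""
    log_w, log_rs = [], []
    for w in window_sizes:
        rs_values = []
        for start in range(0, len(series) - w + 1, w):
            chunk = series[start:start + w]
            m = sum(chunk) / w
            z = zmin = zmax = 0.0
            for x in chunk:
                z += x - m                         # cumulative deviation from mean
                zmin, zmax = min(zmin, z), max(zmax, z)
            s = math.sqrt(sum((x - m) ** 2 for x in chunk) / w)
            if s > 0:
                rs_values.append((zmax - zmin) / s)  # rescaled range of this chunk
        log_w.append(math.log(w))
        log_rs.append(math.log(sum(rs_values) / len(rs_values)))
    # least-squares slope of log(R/S) against log(window size)
    mw, mr = sum(log_w) / len(log_w), sum(log_rs) / len(log_rs)
    num = sum((a - mw) * (b - mr) for a, b in zip(log_w, log_rs))
    den = sum((a - mw) ** 2 for a in log_w)
    return num / den

n = 2048
white = [random.gauss(0, 1) for _ in range(n)]  # independent "bursts"

persistent = [0.0]  # AR(1): each burst remembers the previous one
for _ in range(n - 1):
    persistent.append(0.9 * persistent[-1] + random.gauss(0, 1))

h_white, h_persistent = hurst_rs(white), hurst_rs(persistent)
print(h_white, h_persistent)
```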

This makes me wonder if diaper manufacturers have taken this long-range dependence into account while determining diaper capacity. Or I wonder if, instead, they simply assume that parents will take care of this by adjusting the inter-diaper-change time period.

As Mandelbrot describes towards the end of his excellent The (Mis)behaviour of Markets, you can use so-called “multifractal models”, which combine normal price increments with irregular time increments, to get an accurate (fractal) representation of the movement of stock prices.

PS: Apologies to those who got disgusted by the post. Until a massive burst a few minutes ago I’d never imagined I’d be comparing the flows of poop and the Nile!