The Old Shoe Theory of Relationships

When our daughter was young, some friends saw an uncanny resemblance between her and me, and remarked that “Karthik could have married an old shoe and still produced a child that looks like this” – essentially saying that, at least as far as looks were concerned, the wife hadn’t contributed much (Bambi eyes apart).

Over time, the daughter has shown certain other traits that make her seem rather similar to me. For example, she sticks her tongue out when performing tasks that require some degree of concentration. She laughs like me. Screeches like me. And makes a “burl-burl” noise with her fingers and lips like I do (admittedly, the last one was taught). I’ve already written a fuller list of ways in which the daughter is similar to me.

If you are single and looking to get into a long-term gene-propagating relationship, you inevitably ask yourself whether someone is “the one” for you. We have discussed this topic multiple times on this blog.

For example, we have discussed that as far as men are concerned, one thing they look for in potential partners is “consistent fuckability”. We have also discussed that whether someone is “the one” is not a symmetric question, and that when you ask yourself the question, you get either “no” or “maybe” as an answer, implying that you need to use Monte Carlo algorithms. Being married to the Marriage Broker Auntie, I’m pretty sure I’ve discussed this topic on this blog several other times.

This is a tubelight post – at least two years too late (the “old shoe” comment came that long ago) – but here is yet another framework you can use to determine whether you want someone as your long-term gene-propagating partner. Basically, you replace yourself with an old shoe.

In other words, assume that the genes you propagate along with this person will result in kids who look like them, talk like them and act like them – and who, rather than being a “next best thing”, might just be a superior version of them. Ask yourself if you are okay with having a child who is like this, and of whom you will be proud.

This is another Monte Carlo type question. If the answer is no (you may not be particularly proud of a progeny who is exactly like the person under consideration – for whatever reason), you don’t want to risk propagating genes with this person. If the answer is yes – that you are willing to parent a child who is exactly like this counterparty – then you can seriously consider the long-term relationship.

Again, this applies if and only if you’re looking for a gene-propagating relationship. If that isn’t an issue (no pun intended), then you don’t need to worry about old shoes of any kind.

Hypothesis Testing in Monte Carlo

I find it incredible, and not in a good way, that it took me fourteen years to make the connection between two concepts I learnt barely a year apart.

In August-September 2003, I was auditing a graduate course on Advanced Algorithms, where we learnt about randomised algorithms (I soon stopped auditing since the maths got heavy). One important class of randomised algorithms is what are known as “Monte Carlo algorithms”. Not to be confused with Monte Carlo simulations, these are randomised algorithms that give a one-way result. So, using the most prominent example of such an algorithm, you can ask “is this number prime?”, and the answer can be either “maybe” or “no”.

The randomised algorithm can never conclusively answer “yes” to the primality question. If it finds a witness to the number being composite, it answers “no” (this is conclusive). Otherwise it returns “maybe”. So the way you “conclude” that a number is prime is by running the test a large number of times. Each independent run that returns “maybe” reduces the probability that the true answer is “no”, and when that probability is low enough, you “think” it’s a “yes”. You might like this old post of mine on Monte Carlo algorithms in the context of romantic relationships.
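To make this concrete, here is a minimal sketch of such a test in Python – the Fermat primality test, a simpler cousin of the Miller-Rabin test used in practice (the function name and trial count are my own choices for illustration):

```python
import random

def is_probably_prime(n, trials=20):
    """Monte Carlo primality test: returns a conclusive "no" or a "maybe"."""
    if n < 4:
        return "maybe" if n in (2, 3) else "no"
    for _ in range(trials):
        a = random.randrange(2, n - 1)
        if pow(a, n - 1, n) != 1:  # 'a' witnesses that n is composite
            return "no"            # conclusive
    return "maybe"                 # every trial passed; still not a hard "yes"

print(is_probably_prime(221))  # 221 = 13 x 17, so almost certainly "no"
print(is_probably_prime(223))  # 223 is prime, so always "maybe"
```

The more “maybe”s you accumulate, the more confident you get – but the test itself never says “yes”.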

Less than a year later, in July 2004, as part of a basic course in statistics, I learnt about hypothesis testing. Now (I’m kicking myself for failing to see the similarity then), the main principle of hypothesis testing is that you can never “accept” a hypothesis. You either reject a hypothesis or “fail to reject” it. And if you fail to reject a hypothesis with a certain high probability (basically with more data, which implies more independent evaluations that don’t say “reject”), you will start thinking about “accept”.

Basically, hypothesis testing is a one-sided test, where you are trying to reject a hypothesis. And not being able to reject a hypothesis doesn’t mean we necessarily accept it – there is still a chance of going wrong if we were to accept it (this is where we get into messy territory such as p-values). And this is exactly like Monte Carlo algorithms – one-sided algorithms where we can only conclusively take a decision one way.
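The parallel is easy to demonstrate in code. Here is a toy test (my own construction, not any textbook procedure) of the hypothesis “this coin is fair”, with the p-value itself estimated by Monte Carlo – note that the only two possible verdicts are “reject” and “fail to reject”:

```python
import random

def coin_fairness_test(heads, tosses, sims=10_000, alpha=0.05):
    """Test H0 'the coin is fair'; we can reject it, or fail to, never accept."""
    observed = abs(heads - tosses / 2)
    extreme = 0
    for _ in range(sims):
        sim_heads = sum(random.random() < 0.5 for _ in range(tosses))
        if abs(sim_heads - tosses / 2) >= observed:
            extreme += 1
    p_value = extreme / sims  # share of truly fair coins that look this extreme
    return "reject" if p_value < alpha else "fail to reject"

print(coin_fairness_test(60, 100))  # borderline: most likely "fail to reject"
print(coin_fairness_test(80, 100))  # wildly lopsided: "reject"
```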

So I was thinking of these concepts when I came across a headline on ESPNCricinfo yesterday that said “Rahul Johri not found guilty” (not linking since Cricinfo has since changed the headline). The choice, or rather the ordering, of words was interesting: “not found guilty”, it said, rather than the usual “found not guilty”.

This is again a case of one-sided testing. An investigation can either find someone guilty or fail to do so, and the headline in this case suggested that the latter had happened. It soon became apparent why the headline was constructed this way – it later emerged that the decision to clear Rahul Johri of sexual harassment charges was a contentious one.

In most cases, when someone is “found not guilty” following an investigation, it suggests that the evidence on hand was enough to say that the chance of the person being guilty was rather low. “Not found guilty”, on the other hand, says only that the investigation failed to establish guilt – it expresses no confidence that the person is actually innocent.

So due credit to the Cricinfo copywriters, and due debit to the product managers for later changing the headline rather than putting out a fresh follow-up piece.

PS: The discussion following my tweet on the topic threw up one very interesting insight – that Scotland has had a “not proven” verdict for precisely such cases (you can trust DD to come up with such gems).

Bankers predicting football

So the Football World Cup season is upon us, and this means that investment banking analysts are again engaging in the pointless exercise of trying to predict who will win it. The funny thing this time is that thanks to MiFID II regulations, which prevent banking analysts from giving out reports for free, these reports aren’t in the public domain.

That means we have to rely on media reports of these reports, or on people tweeting insights from them. For example, the New York Times has summarised the banks’ predictions on the winner. And this scatter plot from Goldman Sachs will go straight into my next presentation on spurious correlations:

Different banks have taken different approaches to predicting who will win the tournament. UBS has still gone for a classic Monte Carlo simulation approach, but Goldman Sachs has gone one better and used “four different methods in artificial intelligence” to predict (for the third consecutive time) that Brazil will win the tournament.

In fact, Goldman also uses a Monte Carlo simulation, as Business Insider reports.

The firm used machine learning to run 200,000 models, mining data on team and individual player attributes, to help forecast specific match scores. Goldman then simulated 1 million possible variations of the tournament in order to calculate the probability of advancement for each squad.

But an insider at Goldman with access to the report tells me that the phrase “Monte Carlo” doesn’t appear in the report itself. Maybe it suggests that “data scientists” have taken over the investment research division at the expense of quants.

I’m also surprised by the reporting on Goldman’s predictions. Everyone simply reports that “Goldman predicts that Brazil will win”, but surely (based on the model they’ve used) that prediction comes with a certain probability? A better way of reporting would have been to say “Goldman predicts Brazil is most likely to win, with X% probability” (and the bank’s bets desk in the UK could have placed some money on it).
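For what it’s worth, the simulation half of such an exercise is straightforward to sketch. The following toy (with entirely made-up strength ratings – this is emphatically not Goldman’s model) shows how running many tournament variations naturally produces a win probability for every team rather than a single predicted winner:

```python
import random
from collections import Counter

# Made-up strength ratings for a toy four-team knockout
strengths = {"Brazil": 90, "Germany": 88, "Spain": 87, "France": 86}

def play(a, b):
    """Winner drawn with probability proportional to relative strength."""
    return a if random.random() < strengths[a] / (strengths[a] + strengths[b]) else b

def tournament(teams):
    teams = teams[:]
    random.shuffle(teams)  # random draw
    while len(teams) > 1:  # knockout rounds until one team is left
        teams = [play(teams[i], teams[i + 1]) for i in range(0, len(teams), 2)]
    return teams[0]

runs = 100_000
wins = Counter(tournament(list(strengths)) for _ in range(runs))
for team, n in wins.most_common():
    print(f"{team}: {n / runs:.1%}")  # a probability for each team, not one winner
```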

ING went rather simple with its forecast – it took players’ transfer values, summed them up by team, and concluded that Spain is most likely to win because its squad is the “most valued”. Now, I have two major questions about this approach. Firstly, it ignores the “correlation term” (remember the famous England conundrum of the noughties, of fitting Gerrard and Lampard into the same eleven?) and assumes that a set of strong players makes a strong team. Secondly, have they accounted for inflation, and if so, how? Player valuations (about which I have a chapter in my book) have simply gone through the roof in the last year, with Mo Salah at £35 million being considered a “bargain buy”.
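As reported, ING’s method reduces to a sum of valuations grouped by team. A minimal sketch, with invented player values (in £m) purely for illustration:

```python
# Invented valuations in GBP millions, for illustration only
players = [
    ("Spain", "Isco", 80), ("Spain", "De Gea", 75), ("Spain", "Ramos", 70),
    ("Brazil", "Neymar", 180), ("Brazil", "Coutinho", 40),
]

squad_value = {}
for team, _, fee in players:
    squad_value[team] = squad_value.get(team, 0) + fee

# ING-style conclusion: the most valuable squad is the favourite
print(max(squad_value, key=squad_value.get), squad_value)
```

Note that the sum treats each player’s value as purely additive – exactly the missing “correlation term” complained about above.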

Nomura also seems to have taken a similar approach, though they have in some ways accounted for the correlation term by including “team momentum” as a factor!

Anyway, I look forward to the football! That it is live on BBC and ITV means I get to watch the tournament from the comfort of my home (a luxury in England!). Also being in England means all matches are at a sane time, so I can watch more of this World Cup than the last one.


Matt Levine describes my business idea

When I was leaving the big bank I was working for in 2011 (I keep forgetting whether this blog is anonymous or not, but considering that I’ve now mentioned the bank on my LinkedIn profile (and had people congratulate me “on the new job”), I suppose it’s not anonymous any more), I didn’t bother looking for a new job.

I was going into business, I declared. The philosophy (a word I’ve learnt to use in this context by talking to venture capitalists) was that while quant work in investment banking was already fairly saturated, there was virgin territory in other industries, and I’d use my bank-honed quant skills to improve the level of reasoning in those industries.

Since then, things have more or less gone well. I’ve worked in several sectors and done a lot of interesting work. While a lot of it has been fairly challenging, very little of it has technically been of a level that an investment banking quant would consider challenging. And all this is by design.

I’ve long admired Matt Levine for the way he clearly explains fairly complicated finance stuff in his daily newsletter (which you can get delivered to your inbox for free), and more generally for talking about finance in an entertaining manner. I’ve sometimes mentioned that I want to grow up to be like him – to write like him, to analyse like him, and all that.

And I find that in yesterday’s newsletter he clearly encapsulates the idea with which I started off when I quit banking in 2011. He writes:

A good trick is, find an industry where the words “Monte Carlo model” make you sound brilliant and mysterious, then go to town.

This is exactly what I set out to do in 2011, and have continued to do since then. And you’d be amazed to find the number of industries where “Monte Carlo model” makes you sound brilliant and mysterious.

Considering the difficulties I’ve occasionally had in communicating to people what exactly I do, I think I should adopt Levine’s line to describe my work. I clearly can’t go wrong that way.


Wasps have thin tails, or why cricket prediction algorithms fail...

A couple of months back, I was watching a One Day International match between New Zealand and India. It was during this series that the WASP algorithm – which predicts the score at the end of the innings (for the team batting first) and the chances of victory (for the team batting second) – first came into public prominence. During the games, the on-screen scorecard would include these numbers, derived from the WASP algorithm.

In one game that I was watching (I think it was the fourth match of the series), New Zealand was chasing. They were fifty-odd without loss, and WASP showed their chances of winning as somewhere around 60%. Then Jesse Ryder got out. Suddenly, the chances of winning, as shown on screen, dropped to the forties! How can the fall of one wicket, at an early stage in the game, influence a game so much? The problem is with the algorithm.

Last year, during the IPL, I tried running a graphic that I called the “Tug-of-War”, which was meant to depict how the game swung between the two teams. Taking the analogy forward, if you imagine the game of cricket as a game of tug-of-war, the graph plotted the position of the middle of the rope as a function of time. Here is a sample graphic from last year:

[Figure: “Tug-of-War” graphic from the Pune Warriors vs Sun Risers game]

This shows how the game between the Pune Warriors and the Sun Risers “moved”. The upper end of the graph represents the Sun Risers’ “line” and the lower end the Pune Warriors’. So we can see from this graph that for the first half of the Sun Risers innings (they batted first), the Sun Risers were marginally ahead. Then Pune pulled it back in the second half, and then, somewhere midway through the Pune innings, the Sun Risers pulled it back again and eventually won the game.

At least, that was the intention with which I started putting out this graphic. In practice, you can see that there is a problem. Check out the graph somewhere around the 8th over of the Pune innings, when Marlon Samuels got out. How can one event change the course of the game so dramatically? It was similar to the movement in the WASP when Ryder got out in the recent NZ-India match.

So what is the problem here? Going by the WASP algorithm that its designers have kindly published, and the algorithm I used for last year’s IPL (which was Monte Carlo-based), the one thing the two have in common is that both are Markovian (I know mine is, and from what WASP’s designers have put out, I’m guessing theirs is, too). To put it in English, our algorithms assume that, given the current state of the game (score, wickets fallen and balls remaining), what happens on the next ball doesn’t depend on the path by which the game got there. The odds of different events on the next ball (dot, six, single, out, etc.) are independent of how the previous balls have shaped up. And since that doesn’t accurately represent what happens in a cricket match, we end up with “thin tails”.
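To see what such a Markovian model looks like, here is a stripped-down ball-by-ball innings simulator. The outcome probabilities are numbers I’ve made up for illustration (they are not WASP’s parameters); the point is that the same distribution applies to every ball, regardless of how the innings has gone so far:

```python
import random

# One fixed per-ball outcome distribution (illustrative, not WASP's numbers)
OUTCOMES = ["0", "1", "2", "3", "4", "6", "W"]
WEIGHTS  = [0.48, 0.32, 0.06, 0.01, 0.08, 0.02, 0.03]

def simulate_innings(balls=300, wickets=10):
    """ODI innings where every ball is an independent draw from one
    distribution; a fuller model would condition on wickets and balls
    remaining, but never on the path taken to get there."""
    score, down = 0, 0
    for _ in range(balls):
        outcome = random.choices(OUTCOMES, weights=WEIGHTS)[0]
        if outcome == "W":
            down += 1
            if down == wickets:
                break  # all out
        else:
            score += int(outcome)
    return score

scores = sorted(simulate_innings() for _ in range(10_000))
print(scores[500], scores[5000], scores[9500])  # 5th, 50th and 95th percentiles
```

Because each ball is an independent draw, the innings total is a sum of many independent pieces, and such sums concentrate tightly around their mean – which is precisely the thin-tails problem.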

Recently, to evaluate IPL matches with a view to assessing players ahead of the auction, I reverse-engineered the WASP algorithm and decided to see what it says about the score at the end of an ODI innings. Note that my version is team-agnostic, and assumes that every ball is bowled by “the average bowler” to “the average batsman”. The distribution of the team score at the end of the first innings, as calculated by my algorithm, can be seen in the blue line in the graph below. The red line shows the actual distribution of scores at the end of an ODI innings over the last five years (the same data that was used to construct the model).

[Figure: distribution of first-innings ODI scores – simulated (blue) vs actual (red)]

Note how the blue curve has a much higher peak and tails off very quickly on either side. In other words, a lot of the “mass” is situated within a small range of scores, and this leads to bizarre situations such as the one in the first graph, and the one I saw in the New Zealand-India game.

The problem with a dynamic programming based approach such as WASP’s is that you need to make a Markovian assumption, and that assumption results in thin tails. And when you are trying to predict the probability of victory, and are using a curve such as the blue one above as your expected distribution of the score at the end of the innings, a single event such as a six or a wicket can drastically alter your calculated odds.
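A toy calculation shows how much the thinness of the tails matters. Suppose a wicket knocks ten runs off the expected final score of a team chasing 260 (all numbers invented for illustration):

```python
from statistics import NormalDist

target = 260  # runs needed to win (hypothetical)

for sd in (15, 40):  # thin-tailed vs fatter-tailed predicted score distribution
    before = 1 - NormalDist(mu=255, sigma=sd).cdf(target)  # before the wicket
    after  = 1 - NormalDist(mu=245, sigma=sd).cdf(target)  # mean 10 runs lower
    print(f"sd={sd}: win probability {before:.0%} -> {after:.0%}")
```

With the thin-tailed distribution (sd=15), the same ten-run shift cuts the win probability from about 37% to 16%; with the fatter one (sd=40), it only moves from about 45% to 35%. One wicket, wildly different swings.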

To improve the cricket prediction system, what we need is an algorithm that can replicate the “fat tails” that the actual distribution of cricket scores shows. My current Monte Carlo based algorithm doesn’t cut it. Neither does the WASP.