What Ails Liverpool

So Liverpool FC has had a mixed season so far. They’re second in the Premier League with 36 points from 14 games (only points dropped being draws against ManCity, Chelsea and Arsenal), but are on the verge of going out of the Champions League, having lost all three away games.

Yesterday’s win over Everton was damn lucky, down to a 96th minute freak goal scored by Divock Origi (I’d forgotten he’s still at the club). Last weekend’s 3-0 against Watford wasn’t as comfortable as the scoreline suggested, the scoring having been opened only midway through the second half. The 2-0 against Fulham before that was similarly a close-fought game.

Of concern to most Liverpool fans has been the form of the starting front three – Mo Salah, Roberto Firmino and Sadio Mane. The trio has missed a host of chances this season, and the team has looked incredibly ineffective in the away losses in the Champions League (the only shot on target in the 2-1 loss against PSG being the penalty that was scored by Milner).

There are positives, of course. The defence has been tightened considerably compared to last season: Liverpool aren’t leaking goals the way they did then, and there have been quite a few clean sheets already. So far there has been no repeat of last season’s situation where they went 4-1 up against ManCity, only to quickly let in two goals and set up a tense finish.

So my theory is this – each of the front three of Liverpool has an incredibly low strike rate. I don’t know if the xG stat captures this, but the number of chances required by each of Mane, Salah and Firmino before they can convert is rather high. If the average striker converts one in two chances, all of these guys convert one in four (these numbers are pulled out of thin air; I haven’t looked at the statistics).

And even during the “glory days” of last season when Liverpool was scoring like crazy, this low strike rate remained. Instead, what helped then was a massive increase in the number of chances created. In the one game I watched live (against Spurs at Wembley), what struck me was the number of chances Salah kept missing. But as the chances kept getting created, he ultimately scored one (Liverpool lost 4-1).

What I suspect is that as Klopp decided to tighten things up at the back this season, the number of chances being created has dropped. And with the low strike rate of each of the front three, this lower number of chances translates into a much lower number of goals being scored. If we want last season’s scoring rate, we might also have to accept last season’s concession rate (though this season’s goalie is much much better).

There ain’t no such thing as a free lunch.
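To put rough numbers on the trade-off, here is a minimal sketch. The conversion rates are the same thin-air figures used above, the chances-per-game numbers are hypothetical, and it assumes every chance is converted independently with the same probability, which real football certainly doesn't guarantee.

```python
# Illustrative only: both conversion rates are the post's own made-up figures,
# and the chances-per-game numbers are hypothetical.
LOW_STRIKE_RATE = 0.25   # "one in four" for the front three
AVG_STRIKE_RATE = 0.50   # "one in two" for an average striker


def expected_goals(chances: int, conversion_rate: float) -> float:
    """Expected goals if every chance is converted with the same probability."""
    return chances * conversion_rate


def prob_blank(chances: int, conversion_rate: float) -> float:
    """Probability of scoring nothing, assuming chances are independent."""
    return (1 - conversion_rate) ** chances


for label, chances in [("open game", 12), ("tight game", 6)]:
    print(
        f"{label}: {chances} chances, "
        f"expected goals {expected_goals(chances, LOW_STRIKE_RATE):.2f}, "
        f"chance of a blank {prob_blank(chances, LOW_STRIKE_RATE):.1%}"
    )

# A one-in-two striker would get the same expected goals from half the chances.
print(expected_goals(6, AVG_STRIKE_RATE) == expected_goals(12, LOW_STRIKE_RATE))
```

Under these assumptions, halving the chances halves the expected goals, but it multiplies the chance of drawing a blank several times over. That is roughly what the tighter setup seems to be costing.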

Randomness and sample size

I have had a strange relationship with volleyball, as I’ve documented here. Unlike in most other sports I’ve played, I was a rather defensive volleyball player, excelling in backline defence, setting and blocking, rather than spiking.

The one aspect of my game which was out of line with the rest of my volleyball, but in line with my play in most other sports I’ve played competitively, was my serve. I had a big booming serve, which at school level was mostly unreturnable.

The downside of having an unreturnable serve, though, is that you are likely to miss your serve more often than the rest – it might mean hitting it too long, or into the net, or wide. And like in one of the examples I’ve quoted in my earlier post, it might mean not getting a chance to serve at all, as the warm up serve gets returned or goes into the net.

So I was discussing my volleyball non-career with a friend who is now heavily involved in the game, and he thought that I had possibly been extremely unlucky. My own take on this is that given how little I played, it’s quite likely that things would have gone spectacularly wrong.

Changing domains a little bit, there was a time when I was building strategies for algorithmic trading, in a class known as “statistical arbitrage”. The deal there is that you have a small “edge” on each trade, but if you do a large enough number of trades, you will make money. As it happened, the guy I was working for then got spooked out after the first couple of trades went bad and shut down the strategy at a heavy loss.

Changing domains a little less this time, this is also the reason why you shouldn’t check your portfolio too often if you’re investing for the long term – in the short run, when there have been “fewer plays”, the chances of having a negative return are higher even if you’re in a mostly safe strategy, as I had illustrated in this blog post in 2008 (using the Livejournal URL since the table didn’t port well to wordpress).

And changing domains once again, the sheer number of “samples” is possibly one reason that the whole idea of quantification of sport and “SABRmetrics” first took hold in baseball. The Major League Baseball season is typically 162 games long (and this is before the playoffs), which means that any small edge will translate into results in the course of the league. A smaller league would mean fewer games and thus more randomness, and a higher chance that a “better play” wouldn’t work out.

This also explains why when “Moneyball” took off with the Oakland A’s in the late 1990s and early 2000s, they focussed mainly on league performance and not performance in the playoffs – in the latter, there are simply not enough “samples” for a marginal advantage in team strength to reliably show up in the results.

And this is the problem with newly appointed managers of elite football clubs in Europe “targeting the Champions League” – a knockout tournament of that format means that the best team need not always win. Targeting a national league, played out over at least 34 games in the season, is a much better bet.
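The effect of the number of “samples” can be sketched with an exact binomial calculation. The 55% per-game win probability is a hypothetical edge, games are assumed independent, and the seven-game case is just a stand-in for a short playoff-style series.

```python
from math import comb


def prob_winning_record(p_win: float, n_games: int) -> float:
    """Exact probability of winning more than half of n_games independent games."""
    need = n_games // 2 + 1
    return sum(
        comb(n_games, k) * p_win**k * (1 - p_win) ** (n_games - k)
        for k in range(need, n_games + 1)
    )


EDGE = 0.55  # hypothetical per-game win probability for the "better" team

# Short series, a 34-game league, and a 162-game baseball season
for n_games in (7, 34, 162):
    print(f"{n_games:>3} games: {prob_winning_record(EDGE, n_games):.3f}")
```

The same small edge that is close to a coin toss over seven games becomes a fairly reliable winning record over 162.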

Finally, there is also the issue of variance. A higher variance in performance means that observing a few instances of bad performance is not sufficient to conclude that the player is a bad performer – a great performance need not be too far away. For a player with less randomness in performance – a more steady player, if you will – a few bad performances will tell you that they are unlikely to come good. High risk high return players, on the other hand, need to be given a longer rope.
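A quick way to see this is to ask how often a genuinely good player would produce a short run of bad games purely by chance. The sketch below assumes performances are independent and normally distributed. The means, standard deviations and the “bad game” threshold are all hypothetical.

```python
from statistics import NormalDist


def prob_bad_run(mean: float, sd: float, threshold: float, run_length: int) -> float:
    """Probability that run_length consecutive performances all fall below threshold."""
    p_bad = NormalDist(mu=mean, sigma=sd).cdf(threshold)
    return p_bad**run_length


# Two hypothetical players with the same average performance (50)
steady = prob_bad_run(mean=50, sd=5, threshold=40, run_length=3)
volatile = prob_bad_run(mean=50, sd=25, threshold=40, run_length=3)

print(f"steady player, three bad games in a row:   {steady:.6f}")
print(f"volatile player, three bad games in a row: {volatile:.6f}")
```

With these made-up numbers the volatile player has roughly a 4% chance of three bad games on the trot without being any worse on average, while for the steady player it is of the order of one in a hundred thousand. The same three bad games are therefore much weaker evidence against the volatile player.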

I’d put this in a different way in a blog a few years back, about Mitchell Johnson.

Religion and survivorship bias

Biju Dominic of FinalMile Consulting has a piece in Mint about “what CEOs can learn from religion”. In that, he says,

Despite all the hype, the vast majority of these so-called highly successful, worthy of being emulated companies, do not survive even for few decades. On the other hand, religion, with all its inadequacies, continues to survive after thousands of years.

This is a fallacious comparison.

Firstly, comparing “religion” to a particular company isn’t dimensionally consistent. A better comparison would be to compare at the conceptual level – such as comparing “religion” to “joint stock company”. And like the former, the latter has done rather well for 300 years now, even if specific companies may fold up after a few years.

The other way to make an apples-to-apples comparison is to compare a particular company to a particular religion. And this is where survivorship bias comes in.

Most of the dominant religions of today are hundreds, if not thousands, of years old. In the course of their journey to present-day strength, they have first established their own base and then fought off competition from other upstart religions.

In other words, when Dominic talks about “religion” he is only taking into account religions that have displayed memetic fitness over a really long period. What he fails to take account of are the thousands of startup religions that spring up every few years and then fade into nothingness.

Historically, such religions haven’t been well documented, but that doesn’t mean they didn’t exist. In contemporary times, one can only look at the thousands of “babas” with cults all around India – each is leading his/her own “startup religion”, and most of them are likely to sink without a trace.

Comparing the best in one class (religions that have survived and thrived over thousands of years) to the average of another class (the average corporation) just doesn’t make sense!
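A small simulation makes the point. It deliberately assumes that “religions” and “companies” draw their lifespans from exactly the same distribution (an exponential with a hypothetical mean of 20 years), so that any gap between the two numbers it prints is produced purely by comparing survivors with an average.

```python
import random

random.seed(42)

MEAN_LIFESPAN_YEARS = 20  # hypothetical, and identical for both classes


def sample_lifespans(n: int) -> list[float]:
    """Draw n lifespans from the same exponential distribution."""
    return [random.expovariate(1 / MEAN_LIFESPAN_YEARS) for _ in range(n)]


religions = sample_lifespans(10_000)
companies = sample_lifespans(10_000)

# "Religion" as it gets talked about: only the ten longest-lived survivors
survivors = sorted(religions, reverse=True)[:10]
avg_survivor = sum(survivors) / len(survivors)

# "Companies" as they get talked about: the average of the whole class
avg_company = sum(companies) / len(companies)

print(f"average surviving religion: {avg_survivor:.0f} years")
print(f"average company:            {avg_company:.0f} years")
```

Even though the two classes are identical by construction, the handful of survivors looks several times more durable than the average company.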

Astrology and Data Science

The discussion goes back some 6 years, when I’d first started setting up my data and management consultancy practice. Since I’d freshly quit my job to set up the said practice, I had plenty of time on my hands, and the wife suggested that I spend some of that time learning astrology.

Considering that I’ve never been remotely religious or superstitious, I found this suggestion preposterous (I had a funny upbringing in the matter of religion – my mother was insanely religious (including following a certain Baba), and my father was insanely rationalist, and I kept getting pulled in both directions).

Now, the wife has some (indirect) background in astrology. One of her aunts is an astrologer, and specialises in something called “prashNa shaastra”, where the prediction is made based on the time at which the client asks the astrologer a question. My wife believes this has resulted in largely correct predictions (though I suspect a strong dose of confirmation bias there), and (very strangely to me) seems to believe in the stuff.

“What’s the use of studying astrology if I don’t believe in it one bit”, I asked. “Astrology is very mathematical, and you are very good at mathematics. So you’ll enjoy it a lot”, she countered, sidestepping the question.

We went off into a long discussion on the origins of astrology, and how it resulted in early developments in astronomy (necessary in order to precisely determine the position of planets), and so on. The discussion got involved, and involved many digressions, as discussions of this sort might entail. And as you might expect with such discussions, my wife threw a curveball, “You know, you say you’re building a business based on data analysis. Isn’t data analysis just like astrology?”

I was stumped (ok I know I’m mixing metaphors here), and that had ended the discussion then.

Until I decided to bring it up recently. As it turns out, once again (after a brief hiatus when I decided I’ll do a job) I’m in the process of setting up a data and management consulting business. The difference this time is that I’m in London, and that “data science” is a thing (it wasn’t in 2011). And over the last year or so I’ve been kinda disappointed to see what goes on in the name of “data science” around me.

This XKCD cartoon (which I’ve shared here several times) encapsulates it very well. People literally “pour data into a machine learning system” and then “stir the pile” hoping for the results.

Source: https://xkcd.com/1838/

In the process of applying fairly complex “machine learning” algorithms, I’ve seen people not really bother about whether the analysis makes intuitive sense, or if there is “physical meaning” in what the analysis says, or if the correlations actually imply causation. It’s blind application of “run the data through a bunch of scikit-learn models and accept the output”.

And this is exactly how astrology works. There are a bunch of predictor variables (position of different “planets” in various parts of the “sky”). There is the observed variable (whether some disaster happened or not, basically), which is nicely in binary format. And then some of our ancients did some data analysis on this, trying to identify combinations of predictors that predicted the output (unfortunately they didn’t have the power of statistics or computers, so in that sense the models were limited). And then they simply accepted the outputs, without challenging why it makes sense that the position of Jupiter at the time of wedding affects how your marriage will go.
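Here is a toy version of that process, as a sketch rather than a claim about how any real astrological rule was derived. The outcome and all the “planetary” indicators are independent coin flips, so there is nothing to find. The number of observations and predictors are hypothetical.

```python
import random

random.seed(0)

N_OBSERVATIONS = 30   # hypothetical number of recorded events
N_PREDICTORS = 500    # hypothetical number of "planet in position X" indicators

# Pure noise: the outcome and every predictor are independent coin flips.
outcome = [random.randint(0, 1) for _ in range(N_OBSERVATIONS)]
predictors = [
    [random.randint(0, 1) for _ in range(N_OBSERVATIONS)]
    for _ in range(N_PREDICTORS)
]


def agreement(pred: list[int], target: list[int]) -> float:
    """Fraction of observations where the predictor matches the outcome."""
    return sum(p == t for p, t in zip(pred, target)) / len(target)


# "Stir the pile": keep whichever predictor happens to fit best.
best_index = max(range(N_PREDICTORS), key=lambda i: agreement(predictors[i], outcome))
in_sample = agreement(predictors[best_index], outcome)

# Fresh data for the same (still meaningless) predictor
N_FRESH = 1000
fresh_outcome = [random.randint(0, 1) for _ in range(N_FRESH)]
fresh_predictor = [random.randint(0, 1) for _ in range(N_FRESH)]
out_of_sample = agreement(fresh_predictor, fresh_outcome)

print(f"best predictor, in sample: {in_sample:.0%}")
print(f"same predictor, new data:  {out_of_sample:.0%}")
```

With enough candidate predictors and few enough observations, something will always appear to “predict” the outcome. Whether it makes sense is the only check left, and that is exactly the step that gets skipped.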

So I brought up the topic of astrology and data science again recently, saying “OK after careful analysis I admit that astrology is the oldest form of data science”. “That’s not what I said”, the wife countered. “I said that data science is new age astrology, and not the other way round”.

It’s hard to argue with that!

Biases, statistics and luck

Tomorrow Liverpool plays Manchester City in the Premier League. As things stand now I don’t plan to watch this game. This entire season so far, I’ve only watched two games. First, I’d gone to a local pub to watch Liverpool’s visit to Manchester City, back in September. Liverpool got thrashed 5-0.

Then in October, I went to Wembley to watch Tottenham Hotspur play Liverpool. The Spurs won 4-1. These two remain Liverpool’s only defeats of the season.

I might consider myself to be a mostly rational person but I sometimes do fall for the correlation-implies-causation bias, and think that my watching those games had something to do with Liverpool’s losses in them. Never mind that these were away games played against other top sides which attack aggressively. And so I have this irrational “fear” that if I watch tomorrow’s game (even if it’s from a pub), it might lead to a heavy Liverpool defeat.

And so I told Baada, a Manchester City fan, that I’m not planning to watch tomorrow’s game. And he got back to me with some statistics, which he’d heard from a podcast. Apparently it’s been 80 years since Manchester City did the league “double” (winning both home and away games) over Liverpool. And that it’s been 15 years since they’ve won at Anfield. So, he suggested, there’s a good chance that tomorrow’s game won’t result in a mauling for Liverpool, even if I were to watch it.

With the easy availability of statistics, it has become a thing among football commentators to supply them during the commentary. And on first hearing, things like “never done this in 80 years” or “never done that for last 15 years” sound compelling, and you’re inclined to believe that there is something to these numbers.

I don’t remember if it was Navjot Sidhu who said that statistics are like a bikini (“what they reveal is significant but what they hide is crucial” or something). That Manchester City hasn’t done a double over Liverpool in 80 years doesn’t mean a thing, nor does it say anything that they haven’t won at Anfield in 15 years.

Basically, until the mid 2000s, City were a middling team. I remember telling Baada after the 2007 season (when Stuart Pearce got fired as City manager) that they’d surely be relegated next season. And then came the investment from Thaksin Shinawatra. And the appointment of Sven-Goran Eriksson as manager. And then the YouTube signings. And later the investment from the Abu Dhabi investment group. And in 2016 the appointment of Pep Guardiola as manager. And the significant investment in players after that.

In other words, Manchester City of today is a completely different team from what they were even 2-3 years back. And they’re surely a vastly improved team compared to a decade ago. I know Baada has been following them for over 15 years now, but they’re unrecognisable from the time he started following them!

Yes, even with City being a much improved team, Liverpool have never lost to them at home in the last few years – but then Liverpool have generally been a strong team playing at home in these years! On the other hand, City’s 18-game winning streak (which included wins at Chelsea and Manchester United) only came to an end (with a draw against Crystal Palace) rather recently.

So anyways, here are the takeaways:

  1. Whether I watch the game or not has no bearing on how well Liverpool will play. The instances from this season so far are based on (a) small samples and (b) biased samples (since I’ve chosen to watch Liverpool’s two toughest games of the season).
  2. 80-year history of a fixture has no bearing since teams have evolved significantly in these 80 years. So saying a record stands so long has no meaning or predictive power for tomorrow’s game.
  3. City have been in tremendous form this season, and Liverpool have just lost their key player (by selling Philippe Coutinho to Barcelona), so City can fancy their chances. That said, Anfield has been a fortress this season, so Liverpool might just hold (or even win it).
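The first takeaway can be put in numbers with a small sketch. All the loss probabilities are hypothetical: two tough away games where defeat is quite likely, and 36 other league games where it isn’t.

```python
# Hypothetical per-game loss probabilities for a 38-game league season:
# the two toughest away games first, then 36 more routine fixtures.
loss_prob = [0.45, 0.40] + [0.10] * 36

watched = loss_prob[:2]  # the only games watched happen to be the toughest two

# Chance of seeing two defeats out of two, assuming the games are independent
p_lose_both_watched = watched[0] * watched[1]

watched_loss_rate = sum(watched) / len(watched)
season_loss_rate = sum(loss_prob) / len(loss_prob)

print(f"chance of losing both watched games: {p_lose_both_watched:.0%}")
print(f"expected loss rate in watched games: {watched_loss_rate:.1%}")
print(f"expected loss rate over the season:  {season_loss_rate:.1%}")
```

With numbers like these, losing both watched games is hardly a freak event, and the loss rate in the watched games says very little about the season as a whole. No jinx is required.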

All of this points to a good game tomorrow! Maybe I should just watch it!

Lessons from poker party

In the past I’ve drawn lessons from contract bridge on this blog – notably, I’d described a strategy called “queen of hearts” in order to maximise chances of winning in a game that is terribly uncertain. Now it’s been years since I played bridge, or any card game for that matter. So when I got invited for a poker party over the weekend, I jumped at the invitation.

This was only the second time ever that I’d played poker in a room – I’ve mostly played online where there are no monetary stakes and you see people go all in on every hand with weak cards. And it was a large table, with at least 10 players being involved in each hand.

A couple of pertinent observations (reasonable return for the £10 I lost that night).

Firstly, a windfall can make you complacent. I’m usually a conservative player, bidding aggressively only when I know that I have good chances of winning. I haven’t played enough to have mugged up all the probabilities – that probably offers an edge to my opponents. But I have a reasonable idea of what constitutes a good hand and bid accordingly.

My big drawdown happened in the hand immediately after I’d won big. After an hour or so of bleeding money, I’d suddenly more than broken even. That meant that in my next hand, I bid a bit more aggressively than I would have for what I had. For a while I managed to stay rational (after the flop I knew I had a 1/6 chance of winning big, and having mugged up the Kelly Criterion on my way to the party, bid accordingly).

And when the turn wasn’t to my liking I should’ve just gotten out – the (approx) percentages didn’t make sense any more. But I simply kept at it, falling for the sunk cost fallacy (what I’d put in thus far in the hand). I lost some 30 chips in that one hand, of which at least 21 came at the turn and the river. Without the high of having won the previous hand, I would’ve played more rationally and lost only 9. After all the lectures I’ve given on logic, correlation-causation and the sunk cost fallacy, I’m sad I lost so badly because of the last one.

The second big insight is that poverty leads to suboptimal decisions. Now, this is a well-studied topic in economics but I got to experience it first hand during the session. This was later on in the night, as I was bleeding money (and was down to about 20 chips).

I got pocket aces (a pair of aces in hand) – something I should’ve bid aggressively with. But with the first 3 open cards falling far away from the face cards and being uncorrelated, I wasn’t sure of the total strength of my hand (mugging up probabilities would’ve helped for sure!). So when I had to put in 10 chips to stay in the hand, I baulked, and folded.

Given the play on the table thus far, it was definitely a risk worth taking, and with more in the bank, I would have. But poverty and the Kelly Criterion meant that the number of chips that I was able to invest in the arguably strong hand was limited, and that limited my opportunity to profit from the game.
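For reference, the textbook Kelly fraction for a simple bet that pays b-to-1 with win probability p is f* = p - (1 - p) / b. The sketch below applies it to the two hands described here. The 1/6 chance, the 20-chip stack and the 10-chip call come from the post. The payout odds and the win probability for the pocket aces are hypothetical, and treating a poker hand as a single fixed-odds bet is a big simplification.

```python
def kelly_fraction(p_win: float, net_odds: float) -> float:
    """Kelly fraction f* = p - (1 - p) / b for a bet paying b-to-1, floored at 0."""
    return max(0.0, p_win - (1.0 - p_win) / net_odds)


# Hand 1: a 1/6 chance of winning big. The 8-to-1 payout is a hypothetical figure.
print(f"1/6 chance at 8-to-1: stake {kelly_fraction(1 / 6, 8):.1%} of the stack")
print(f"1/6 chance at 5-to-1: stake {kelly_fraction(1 / 6, 5):.1%} of the stack")

# Hand 2: pocket aces with a 10-chip call.
# The 60% win probability and 2-to-1 payout are hypothetical.
CALL = 10
for stack in (20, 60):
    max_stake = stack * kelly_fraction(0.6, 2)
    verdict = "can call" if max_stake >= CALL else "cannot call"
    print(f"stack of {stack}: Kelly stake {max_stake:.0f} chips, {verdict}")
```

On these assumptions the same hand is a call with 60 chips behind and a fold with 20, which is the “poverty” effect in miniature.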

It is no surprise that the rest of the night petered out for me as my funds dwindled and my ability to play diminished. Maybe I should’ve bought in more when I was down to 20 chips – but then given my ability relative to the rest of the table, that would’ve been good money after bad.

Scott Adams, careers and correlation

I’ve written here earlier about how much I’ve been influenced by Scott Adams’s career advice about “being in top quartile of two or more things”. To recap, this is what Adams wrote nearly ten years back:

If you want an average successful life, it doesn’t take much planning. Just stay out of trouble, go to school, and apply for jobs you might like. But if you want something extraordinary, you have two paths:

1. Become the best at one specific thing.
2. Become very good (top 25%) at two or more things.

The first strategy is difficult to the point of near impossibility. Few people will ever play in the NBA or make a platinum album. I don’t recommend anyone even try.

Having implemented this to various degrees of success over the last 5-6 years, I propose a small correction – basically to follow the second strategy that Adams has mentioned, you need to take correlation into account.

Basically there’s no joy in becoming very good (top 25%) at two or more correlated things. For example, if you think you’re in the top 25% in terms of “maths and physics” or “maths and computer science” there’s not so much joy because these are correlated skills. Lots of people who are very good at maths are also very good at physics or computer science. So there is nothing special in being very good at such a combination.

Why Adams succeeded was that he was very good at 2-3 things that are largely uncorrelated – drawing, telling jokes and understanding corporate politics are not very correlated to each other. So the combination of these three skills of his was rather rare, and it resulted in the wildly successful Dilbert.

So the key is this – in order to be wildly successful, you need to be very good (top 25%) at two or three things that are not positively correlated with each other (either orthogonal or negative correlation works). That ensures that if you can put them together, you can offer something that very few others can offer.
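How much correlation matters can be checked with a rough simulation. It models two skills as standard normal variables with a chosen correlation, which is an assumption made for convenience. The correlation values themselves are hypothetical.

```python
import math
import random

random.seed(1)


def share_top_quartile_in_both(rho: float, n: int = 100_000) -> float:
    """Simulate two skills with correlation rho; return the fraction of
    people who land in the top 25% of both."""
    xs, ys = [], []
    for _ in range(n):
        a, b = random.gauss(0, 1), random.gauss(0, 1)
        xs.append(a)
        # Mix a and b so that corr(x, y) = rho
        ys.append(rho * a + math.sqrt(1 - rho * rho) * b)
    # Empirical 75th-percentile cut-off for each skill
    cut_x = sorted(xs)[int(0.75 * n)]
    cut_y = sorted(ys)[int(0.75 * n)]
    return sum(x > cut_x and y > cut_y for x, y in zip(xs, ys)) / n


# Hypothetical correlations: strongly positive, none, moderately negative
results = {rho: share_top_quartile_in_both(rho) for rho in (0.8, 0.0, -0.5)}
for rho, share in results.items():
    print(f"correlation {rho:+.1f}: {share:.1%} are top quartile in both")
```

With uncorrelated skills about one person in sixteen (25% of 25%) clears both bars. Positive correlation makes the combination more common and negative correlation makes it rarer, which is the whole argument in one number.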

Then again, the problem there is that the market for this combination of skills will be highly illiquid – low supply means people who might demand such combinations would have adapted to make do with some easier to find substitute, so demand is lower, and so on. So in that sense, again, it’s a massive hit-or-miss!