Distribution of political values

Through Baal on Twitter I came across this “Political Compass” survey. I took it, and this is the “political compass” it gave me.

Now, I’m not happy with the result. I mean, I’m okay with where the red dot has been placed for me – the average value – and I think it represents my political leanings rather well. What I’m unhappy about is that my political views have all been reduced to one single average point.

I’m pretty sure that, based on all the answers I gave in the survey, my political leaning along each of the two dimensions follows a distribution, and the red dot here is only the average (mean, I guess, but it could also be the median) of that distribution.

However, there are many ways in which someone’s views can land right on my dot – some people might have consistent but mild views in favour of or against particular positions. Others might have pretty extreme views – for example, some of my answers might lead you to believe that I’m an extreme right winger, and others might make me look like a Marxist (I believe I have a pretty high variance on both axes around my average value).

So what I would have liked from the political compass is a sort of heat map, or at least two marginal distributions, showing how my answers are distributed along the two axes, rather than having all my views reduced to one average value.
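
Something like the sketch below is what I have in mind. It assumes access to per-question scores on the two axes, which the site doesn’t actually expose – the variable names, scoring scale and number of questions here are purely illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical per-answer scores on the two axes, each on a [-10, 10] scale.
# In reality these would come from the survey's own scoring of each answer.
rng = np.random.default_rng(42)
economic = rng.normal(loc=2.0, scale=4.0, size=62)   # left-right axis
social = rng.normal(loc=-1.0, scale=5.0, size=62)    # libertarian-authoritarian axis

fig, ax = plt.subplots(figsize=(6, 6))

# 2D histogram ("heat map") of where the individual answers fall
ax.hist2d(economic, social, bins=10, range=[[-10, 10], [-10, 10]], cmap="Reds")

# The single red dot the site shows is just the mean of this distribution
ax.plot(economic.mean(), social.mean(), "o", color="darkred", markersize=10)

ax.set_xlabel("Economic (left-right)")
ax.set_ylabel("Social (libertarian-authoritarian)")
ax.set_title("Political compass as a distribution, not a point")
plt.show()
```

The marginal distributions would just be ordinary histograms of the two arrays taken separately.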

A version of this is the main argument of a book I read recently called “The End of Average”: when we design for “the average man” or “the average customer” across several dimensions, we end up designing for nobody, since nobody is average when looked at across many dimensions.

Statistical analysis revisited – machine learning edition

Over ten years ago, I wrote this blog post that I had termed a “lazy post” – it was an email that I’d written to a mailing list, which I’d then copied onto the blog. It was triggered by someone on the group making an off-hand comment about “doing regression analysis”, and I had set off on a rant about why the misuse of statistics was a massive problem.

Ten years on, I find the post to be quite relevant, except that instead of “statistics”, you just need to say “machine learning” or “data science”. So this is a truly lazy post, where I piggyback on my old post to talk about the problems with the indiscriminate use of data and models.

I had written:

there is this popular view that if there is data, then one ought to do statistical analysis, and draw conclusions from that, and make decisions based on these conclusions. unfortunately, in a large number of cases, the analysis ends up being done by someone who is not very proficient with statistics and who is basically applying formulae rather than using a concept. as long as you are using statistics as concepts, and not as formulae, I think you are fine. but you get into the “ok i see a time series here. let me put regression. never mind the significance levels or stationarity or any other such blah blah but i’ll take decisions based on my regression” then you are likely to get into trouble.

The modern version of this is that everybody wants to do “big data” and “data science”. So if there is some data out there, people will want to draw insights from it. And since it is easy to apply machine learning models (thanks to open source toolkits such as the scikit-learn package in Python), people who don’t understand the models indiscriminately apply them to whatever data they have. So you have people who don’t really understand either the data or the machine learning working with both, and creating models that are dangerous.
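
To make the point concrete, here is a small illustration of my own (not from the original email) of how easy the toolkits make it to fit something that looks impressive while ignoring the “blah blah” about stationarity:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Two completely unrelated random walks (non-stationary time series)
x = np.cumsum(rng.normal(size=500))
y = np.cumsum(rng.normal(size=500))

model = LinearRegression().fit(x.reshape(-1, 1), y)
print("R-squared:", model.score(x.reshape(-1, 1), y))
# This will often print a large R-squared even though x and y have nothing
# to do with each other -- the classic "spurious regression" problem that a
# stationarity check (or differencing the series) would have caught.
```

Three lines of scikit-learn, and you have a “model” that is worse than useless for making decisions.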

As long as people have an idea of the models they are using, the assumptions behind them, and the quality of the data that goes into them, we are fine. However, we are increasingly seeing cases of people using improper or biased data and applying models they don’t understand on top of it, with consequences that affect the wider world.

So the problem is not with “artificial intelligence” or “machine learning” or “big data” or “data science” or “statistics”. It is with the people who use them incorrectly.


Big Data and Fast Frugal Trees

In his excellent podcast episode with EconTalk’s Russ Roberts, psychologist Gerd Gigerenzer introduces the concept of “fast and frugal trees”. When someone needs to make decisions quickly, Gigerenzer says, they don’t take into account a large number of factors, but instead rely on a small set of rules of thumb.

The podcast itself is based on Gigerenzer’s 2009 book Gut Feelings. Based on how awesome the podcast was, I read the book, but found that it didn’t offer much more than the podcast itself had.

Coming back to fast and frugal trees…

In recent times, ever since “big data” became a “thing” in the early 2010s, it has become popular for companies to tout the complexity of their decision algorithms and machine learning systems. An easy way for companies to display this complexity is to talk about the number of variables they take into account while making a decision.

For example, you can have “fin-tech” lenders who claim to use “thousands of data points” on their prospective customers’ histories to determine whether to give out a loan. A similar number of data points is used to evaluate resumes and determine if a candidate should be called for an interview.

With cheap data storage and compute power, it has become rather fashionable to “use all the data available” and build complex machine learning models (which aren’t that complex to build) for decisions that were earlier made by humans. The problem is that this can sometimes result in overfitting (the system learning patterns it shouldn’t be learning), which can lead to disastrously poor predictions.

In his podcast, Gigerenzer talks about fast and frugal trees, and says that humans in general don’t use too many data points to make their decisions. Instead, for each decision, they build a quick “fast and frugal tree” and make their decision based on their gut feeling about a small number of data points. Which data points to use is determined primarily by their experience (not cow-like experience), and can vary by person and situation.

The advantage of fast and frugal trees is that the model is simple, so there is little scope for overfitting. Moreover, as the name suggests, the decision process is “fast”, and you don’t have to collect all possible data points before you make a decision. The problem with productionising fast and frugal trees, however, is that each user’s decision-making process is different, and the challenge lies in learning that process so as to make good decisions at a personalised level.

Learning someone’s decision-making process (when you’ve assumed it’s a fast and frugal tree) is not trivial, but if you can figure it out, you can build significantly superior recommender systems.
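
One crude way to approximate a “fast and frugal tree” in code is to fit a very shallow decision tree per user, so that only a handful of data points ever get looked at. This is just a sketch of the idea with made-up feature names and a toy “ground truth”, not a claim about how any real recommender works:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical viewing history for one user: a few attributes per title,
# and whether the user actually watched it.
rng = np.random.default_rng(1)
n = 200
movie_age_years = rng.uniform(0, 40, n)
imdb_score = rng.uniform(3, 9, n)
has_favourite_actor = rng.integers(0, 2, n)

# Toy "ground truth": this user mostly watches recent, well-rated movies
watched = ((movie_age_years < 10) & (imdb_score > 7)).astype(int)

X = np.column_stack([movie_age_years, imdb_score, has_favourite_actor])

# Limiting the depth keeps the tree "fast and frugal": few questions asked,
# and little room to overfit
tree = DecisionTreeClassifier(max_depth=2).fit(X, watched)
print(export_text(
    tree,
    feature_names=["movie_age_years", "imdb_score", "has_favourite_actor"],
))
```

The printed splits then act as a readable proxy for which two or three data points this particular user seems to care about.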

If you’re Netflix, for example, you might figure that someone makes their movie choices based only on the age of a movie and its IMDb score. So their screen is customised to show just these two parameters. Someone else might be making their decisions based on who the lead actors are, and they need to be shown that information along with the recommendations.

Another book I read recently was Todd Rose’s The End of Average. The book makes the powerful point that nobody really is average, especially when you’re looking at a large number of dimensions, so designing for the average means you’re designing for nobody.

I imagine one reason a lot of recommender systems (Netflix or Amazon or Tinder) fail is that they model for the average, building one massive machine learning system, rather than learning each person’s fast and frugal tree.

The latter isn’t easy, but if it can be done, it can result in a significantly superior user experience!

Liverpool FC: Mid Season Review

After 20 games played, Liverpool are sitting pretty on top of the Premier League with 58 points (out of a possible 60). The only jitter in the campaign so far came in a draw away at Manchester United.

I made what I think is a cool graph to put this performance in perspective. I looked at Liverpool’s points tally at the end of the first 19 match days in each Premier League season, and looked at the “progress” towards that tally (the data for last night’s win against Sheffield isn’t yet in my dataset, which also doesn’t include the 1992-93 season, so those are left out).
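
For what it’s worth, the computation behind the graph is roughly the sketch below, assuming a match-level dataset with one row per Liverpool league game – the file name and column names are mine, not from any particular source.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Assumed columns: season, matchday (1-38), points_won (3, 1 or 0)
matches = pd.read_csv("liverpool_epl_matches.csv")

first_half = (
    matches[matches["matchday"] <= 19]
    .sort_values(["season", "matchday"])
    .copy()
)
first_half["points_so_far"] = first_half.groupby("season")["points_won"].cumsum()

# "Progress" curves: one line per season, points accumulated by each matchday
progress = first_half.pivot(index="matchday", columns="season", values="points_so_far")
progress.plot(legend=False)
plt.show()
```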

Given the strength of this season’s performance, I don’t think there’s that much information in the graph, but here it goes in any case:

I’ve coloured all the seasons where Liverpool were title contenders. A few things stand out:

  1. This season, while great, isn’t that much better than the last one. Last season, Liverpool had three draws in the first half of the season (Man City at home, Chelsea away and Arsenal away). It was in the first month of the second half that the campaign faltered (starting with the loss to Man City).
  2. This possibly went under the radar, but Liverpool had a fantastic start to the 2016-17 season as well, with 43 points at the halfway stage. To put that in perspective, this was one more than the points total at that stage in the title-chasing 2008-9 season.
  3. Liverpool went close in 2013-14, but in terms of points, the halfway performance wasn’t anything to write home about. That was also back when teams didn’t dominate the way they do now, and eighty-odd points was enough to win the league.

This is what Liverpool’s full season looked like (note that I’ve used a different kind of graph here. Not sure which one is better).


Finally, what’s the relationship between points at the end of the first half of the season (19 games) and the full season? Let’s run a regression across all teams, across all 38 game EPL seasons.

The regression doesn’t turn out to be THAT strong, with an R-squared of just 41% (it is statistically highly significant, but it doesn’t explain much). In other words, a team’s points tally at the halfway point of the season explains less than half of the variation in the points tally that the team will get in the second half of the season.

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  9.42967    0.97671   9.655   <2e-16 ***
Midway       0.64126    0.03549  18.070   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 6.992 on 478 degrees of freedom
  (20 observations deleted due to missingness)
Multiple R-squared:  0.4059,    Adjusted R-squared:  0.4046 
F-statistic: 326.5 on 1 and 478 DF,  p-value: < 2.2e-16

The interesting thing is that the coefficient of the midway score is less than 1, which implies that teams’ second-half performances (literally) regress towards the mean.

55 points at the end of the first 19 games is projected to translate to about 100 at the end of the season. In fact, based on this regression model run on the first 19 games of the season, Liverpool should win the title at a canter.
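
For the record, that projection is just the fitted equation applied to Liverpool’s halfway tally, using the coefficients in the output above (second-half points regressed on the “Midway” tally):

```python
# Coefficients from the regression summary above
intercept, slope = 9.42967, 0.64126

midway = 55  # Liverpool's points after 19 games
projected_second_half = intercept + slope * midway   # ~44.7
projected_total = midway + projected_second_half     # ~99.7, i.e. about 100

print(round(projected_total, 1))
```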

PS: Look at the bottom of this projections table. It seems like for the first time in a very long time, the “magical” 40 points might be necessary to stave off relegation. Then again, it’s regression (pun intended).

This year on Spotify

I’m rather disappointed with my end-of-year Spotify report this year. I mean, I know it’s automated analytics, and no human has really verified it, etc., but there are some basics that the algorithm failed to cover.

The first few slides of my “annual report” told me that my listening changed by seasons. That in January to March, my favourite artists were Black Sabbath and Pink Floyd, and from April to June they were Becky Hill and Meduza. And that from July onwards it was Sigala.

Now, there was a life-changing event in late March that Spotify knows about, but failed to acknowledge in the report – I moved from the UK to India. And in India, Spotify’s inventory is far smaller than it is in the UK. So some of the bands I used to listen to heavily in the UK, like Black Sabbath, went off my playlist in India. My daughter’s lullaby playlist, which accounts for most of my listening, moved from Spotify to Amazon Music (and more recently to Apple Music).

The other thing with my Spotify use case is that it’s not just me who listens on the account. I share it with my wife and daughter, and while I know that Spotify has an algorithm for filtering out kid stuff, I’m surprised it didn’t figure out that two people are sharing this account (and pitch us a family subscription).

According to the report, these were my most listened-to genres in 2019:

Now there are two clear classes of genres here, and I’m surprised that Spotify failed to pick that out. Moreover, the devices associated with my account that play Rock or Power Metal are disjoint from the devices that play Pop, EDM or House. It’s almost like Spotify didn’t want to admit that people share accounts.
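
The check itself is trivial if you have play-level data. The sketch below assumes a per-stream export with the device and the track’s genre – neither of which Spotify actually exposes in this form, so the columns and file name are purely illustrative.

```python
import pandas as pd

# Hypothetical play-level data: one row per stream
plays = pd.read_csv("spotify_plays.csv")  # columns assumed: device, genre

genres_by_device = plays.groupby("device")["genre"].apply(set)

# If the genre sets of two devices barely overlap, the "account" is
# probably two different listeners
for d1 in genres_by_device.index:
    for d2 in genres_by_device.index:
        if d1 < d2:
            overlap = genres_by_device[d1] & genres_by_device[d2]
            print(d1, d2, "shared genres:", len(overlap))
```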

Then there were some three slides on my podcast listening for the year, when I’ve listened to all of five hours of podcasts on Spotify. If I, a human, were building this report, I would have dropped this section citing insufficient data, rather than wasting three slides on analytics that simply don’t make sense.

I see the importance of this segment in Spotify’s report, since they want to focus more on podcasts (being an “audio company” rather than a “music company”), but maybe something in the report to encourage me to use Spotify for more podcasts (perhaps recommending Spotify exclusives that I might like, even if based on limited data) might have helped.

Finally, take a look at our most played songs in 2019.

It looks like my daughter’s sleeping playlist threaded with my wife’s favourite songs (after a point the latter dominate). “My songs” are nowhere to be found – I have to go all the way down to number 23 to find Judas Priest’s cover of Diamonds and Rust. I mean I know I’ve been diversifying the kind of music that I listen to, while my wife listens to pretty much the same stuff over and over again!

In any case, automated analytics is all fine, but there are some not-so-edge cases where the reports it generates are obviously bad. Hopefully the people at Spotify will figure this out and use more intelligence in producing next year’s report!

Spurs right to sack Pochettino?

A few months back, I built my “football club elo by manager” visualisation. Essentially, we take the week-by-week Premier League Elo ratings from ClubElo and overlay it with managerial tenures.
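
The mechanics of that overlay are roughly the sketch below. ClubElo publishes its ratings, but the exact file format, column names and the hand-built managers table here are assumptions for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Elo ratings over time (assumed to have been downloaded as a CSV with
# 'From' and 'Elo' columns), plus a hand-built table of managerial tenures
elo = pd.read_csv("liverpool_clubelo.csv", parse_dates=["From"])
managers = pd.read_csv("liverpool_managers.csv", parse_dates=["start", "end"])

fig, ax = plt.subplots(figsize=(10, 5))
ax.plot(elo["From"], elo["Elo"])

# Shade each manager's tenure so rises and falls line up with who was in charge
for _, row in managers.iterrows():
    ax.axvspan(row["start"], row["end"], alpha=0.15)
    ax.text(row["start"], elo["Elo"].max(), row["name"], rotation=90, fontsize=8)

ax.set_ylabel("Elo rating")
plt.show()
```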

A clear pattern emerges – a lot of Premier League sackings have coincided with clubs going down significantly in terms of Elo rating. For example, we have seen that Liverpool sacked Rafa Benitez, Kenny Dalglish (in 2012) and Brendan Rodgers all at the right time, and that Manchester United similarly sacked Jose Mourinho when he had brought them back to below where he started.

And now the news comes in that Spurs have joined the party, sacking long-time coach Mauricio Pochettino. What I find interesting is the timing of the sacking – while international breaks are a popular time to change managers (the two-week gap in fixtures gives a club some time to adjust), most sackings happen in the first week of the break.

The Pochettino sacking is surprising in that it has come towards the end of the international break, giving the club four days before their next fixture (a derby at the struggling West Ham). However, the Guardian reports that Spurs are close to hiring Jose Mourinho, and that might explain the timing of the sacking.

So were Spurs right in sacking Pochettino, barely six months after he took them to a Champions League final? Let’s look at the Spurs story under Pochettino using Elo ratings. 


Pochettino took over in 2014 after an underwhelming 2013-14, when the club struggled under Andre Villas-Boas and then Tim Sherwood. Initially, results weren’t too promising, as he took them from an 1800 rating down to 1700.

However, chairman Daniel Levy’s patience paid off, and the club mounted a serious challenge to Leicester in the 2015-16 season before falling away towards the end, finishing third behind Arsenal. As the Elo chart shows, the improvement continued, and the club remained in Champions League places through the course of Pochettino’s reign.

Personally, the “highlight” of Pochettino’s reign was Spurs’ 4-1 demolition of Liverpool at Wembley in October 2017, a game I happened to watch at the stadium. And as per the Elo ratings the club plateaued shortly after that.

If that plateau had continued, I suppose Pochettino would have remained in his job, giving the team regular Champions League football. This season, however, has been a disaster.

Spurs are 13 points below what they had scored in comparable fixtures last season, and unlikely even to finish in the top six. Their Elo has also dropped below 1850 for the first time since 2016-17. While that is still higher than where Pochettino started off, the precipitous drop in recent times means that the club has possibly taken the right call in sacking him.

If Mourinho does replace him (it looks likely, as per the Guardian), it will present a personal problem for me – for over a decade now, Tottenham have been my “second team” in the top half of the Premier League, behind Liverpool. That cannot continue if Mourinho takes over. I’m wondering who to shift my allegiance to – it will have to be either Leicester or (horror of horrors) Chelsea!

Alchemy

Over the last 4-5 days I kinda immersed myself in finishing Rory Sutherland’s excellent book Alchemy.

It all started with a podcast, with Sutherland being the guest on Russ Roberts’ EconTalk last week. I’d barely listened to half the podcast when I knew that I wanted more of Sutherland, and so immediately bought the book on Kindle. The same evening, I finished my previous book and started reading this.

Sometimes I get a bit concerned that I’m agreeing with an author too much. What made this book “interesting” is that Sutherland is an ad-man and a marketer, who keeps talking down data and economics, and playing up intuition and “feeling”. In other words, at least as far as professional career and leanings go, he is possibly as far from me as it gets. Yet, I found myself silently nodding in agreement as I went through the book.

If I had to summarise the book in one line, I would say: “most decisions are made intuitively or based on feeling. Data and logic are mainly used to rationalise decisions rather than to make them”.

And if you think about it, it’s mostly true. For example, you don’t use physics to calculate how much to press down on your car accelerator while driving – you do it essentially by trial and error and using your intuition to gauge the feedback. Similarly, a ball player doesn’t need to know any kinematics or projectile motion to know how to throw or hit or catch a ball.

The other thing that Sutherland repeatedly alludes to is that we tend to optimise the things that are easy to measure. This decade, with the “big data revolution” being followed by the rise of “data science”, the amount of data available to make decisions has become endless, meaning that more and more decisions are being made using data.

The trouble, of course, is availability bias, or what I call the “keys-under-lamppost bias”. We tend to optimise and make decisions on things that are easily measurable (a set that is, of course, much larger now than it was a decade ago), and because we know we are making use of more objective inputs, we have irrational confidence in our decisions.

Sutherland talks about barbell strategies, ergodicity, why big data leads to bullshit, why it is important to look for solutions beyond the scope of the immediate domain, and the Dunning-Kruger effect. He makes statements such as “I would rather run a business with no mathematicians than with second-rate mathematicians”, which exactly mirrors my opinion of the “data science industry”.

There is absolutely no doubt about why I liked the book.

Thinking again, while I said that professionally Sutherland seems as far from me as possible, it’s possibly not so true. While I do use a fair bit of data and economic analysis as part of my consulting work, I find that I make most of my decisions finally on intuition. Data is there to guide me, but the decision-making is always an intuitive process.

In late 2017, when I briefly worked in an ill-fated “data science” job, I’d made a document about the benefits of combining data analysis with human insight. And if I think about my work, my least favourite assignments have been those where I’ve used data to help clients make “logical decisions” (as Sutherland puts it).

The work I’ve enjoyed the most has been where I’ve used the data and presented it in ways in which my clients and I have noticed patterns, rationalised them and then taken an (intuitive) leap of faith about what the right course of action may be.

And this also means that over time I’ve been moving away from work that involves building models (the output is too “precise” to interest me), and towards more “strategic” stuff where there is a fair amount of intuition riding on top of the data.

Back to the book – I’m so impressed with it that if I were still living in London, I would have pestered Sutherland to meet me, and then tried to convince him to let me work for him. Even if, at the top level, it seems like his work and mine are diametrically opposite…

I leave you with my highlights and notes from the book, and this tweet.

Here’s my book, in case you are interested.


EPL: Mid-Season Review

Going into the November international break, Liverpool are eight points ahead at the top of the Premier League. Defending champions Manchester City have slipped to fourth place following their loss to Liverpool. The question most commentators are asking is if Liverpool can hold on to this lead.

We are two-thirds of the way through the first round robin of the Premier League. The thing with evaluating league standings midway through the round robin is that it doesn’t account for the fixture list. For example, Liverpool have finished playing the rest of the “big six” (or seven, if you include Leicester), but Manchester City still have many games to play against the top teams.

So my practice over the years has been to compare team performance to corresponding fixtures in the previous season, and to look at the points difference. Then, assuming the rest of the season goes just like last year, we can project who is likely to end up where.

Now, relegation and promotion introduce a complication, but we can “solve” that by replacing last season’s relegated teams with this season’s promoted teams (18th place replaced by the Championship winners, 19th by the Championship runners-up, and 20th by the Championship playoff winners).
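
The bookkeeping is straightforward once the promoted teams have been mapped onto the relegated ones. The sketch below assumes fixture-level files with one row per team per fixture; the file names and column names are mine, purely for illustration.

```python
import pandas as pd

# Assumed columns in both files: team, home_team, away_team, points
# (promoted teams already mapped onto the relegated teams they replaced)
this_season = pd.read_csv("epl_2019_20.csv")
last_season = pd.read_csv("epl_2018_19.csv")

# Line up each fixture with the corresponding fixture from last season
merged = this_season.merge(
    last_season,
    on=["team", "home_team", "away_team"],
    suffixes=("_now", "_last"),
)

# Points differential per team versus corresponding fixtures
differential = (
    (merged["points_now"] - merged["points_last"])
    .groupby(merged["team"])
    .sum()
    .sort_values(ascending=False)
)
print(differential)
```

Projecting the final table is then just last season’s full-season points plus each team’s differential so far.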

It’s not the first time I’m doing this analysis. I’d done it once in 2013-14, and once in 2014-15. You will notice that the graphs look similar as well – that’s how lazy I am.

Anyways, this is the points differential thus far compared to corresponding fixtures of last season. 


Leicester are the most improved team from last season, having scored 8 points more than in the corresponding fixtures last season. Sheffield United, albeit starting from a low base, have also done extremely well. And last season’s runners-up Liverpool are at plus 6.

The team that has done worst relative to last season is Tottenham Hotspur, at minus 13. Key players entering the final years of their contracts without signing extensions, and scanty recruitment over the last 2-3 years, haven’t helped. And then there is Manchester City, at minus 9!

So, assuming the rest of the season’s fixtures go according to last season’s corresponding fixtures, what will the final table look like at the end of the season?

We see that if Liverpool replicate their results from last season for the rest of the fixtures, they should win the league comfortably.

What is more interesting is the gaps between 1-2, 2-3 and 3-4. Each of the top three positions is likely to be decided “comfortably”, with a fairly congested mid-table.

As mentioned earlier, this kind of analysis is unfair to the promoted teams. It is highly unlikely that Sheffield will get relegated based on the start they’ve had.

We’ll repeat this analysis after a couple of months to see where the league stands!

Segmentation and machine learning

For best results, use machine learning to do customer segmentation, but then get humans with domain knowledge to validate the segments.

There are two common ways in which people do customer segmentation. The “traditional” method is to manually define the axes through which the customers will get segmented, and then simply look through the data to find the characteristics and size of each segment.

Then there is the “data science” way of doing it, which is to ignore all intuition, and simply use some method such as K-means clustering to “do gymnastics” with the data and find the clusters.

A quantitative extreme of this method is to do gymnastics with your data, get segments out of it, and quantitatively “take action” on them without really bothering to figure out what each cluster represents. Loosely speaking, this is how a lot of recommendation systems nowadays work – some algorithm somewhere finds people similar to you based on your behaviour, and recommends to you what they liked.

I usually prefer a sort of middle ground. I like to let the algorithms (k-means easily being my favourite) come up with the segments based on the data, and then have a bunch of humans look at the segments and make sense of them.

Basically, whatever segments are thrown up by the algorithm need to be validated by human intuition. Getting counterintuitive clusters is not a problem either – on several occasions, the people I’ve validated the clusters with (usually clients) have used the counterintuitive clusters to discover bugs, gaps in the data, or patterns that they didn’t know of earlier.

Also, in terms of validation, it is always useful to get people with domain knowledge to look at the clusters. This means that you need to be able to represent whatever clusters you’ve generated in a human-readable format. The best way of doing that is to take the cluster centres and describe them in a “physical” manner.
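
A minimal sketch of this middle-ground workflow follows – the column names and the choice of five clusters are illustrative assumptions, not a recipe:

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical customer-level data with illustrative column names
customers = pd.read_csv("customers.csv")
features = ["orders_per_month", "avg_basket_value", "months_since_signup"]

# Standardise so no single feature dominates the distance metric
scaler = StandardScaler()
X = scaler.fit_transform(customers[features])

kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X)

# Express the cluster centres back in the original ("physical") units so that
# people with domain knowledge can look at them and validate or question them
centres = pd.DataFrame(
    scaler.inverse_transform(kmeans.cluster_centers_),
    columns=features,
)
centres["size"] = pd.Series(kmeans.labels_).value_counts().sort_index().values
print(centres.round(2))
```

The printed table of centres (plus cluster sizes) is what goes in front of the domain experts; anything that looks counterintuitive becomes a prompt to check the data or to update the intuition.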

I started writing this post some three days ago and am only getting to finish it now. Unfortunately, in the meantime I’ve forgotten the exact motivation for why I started writing this. If I recall it, I’ll maybe do another post.

Fishing in data pukes

When a data puke is presented periodically, consumers of the puke learn to “fish” for insights in it. 

I’ve been wondering why data pukes are so common. After all, they require significant effort on the part of the consumer to understand what is happening, and to get any sort of insight. In contrast, a well-designed dashboard presents the information crisply and concisely.

In practical life, though, most business reports and dashboards I come across can at best be described as data pukes. There is data all over the place, and little to help the consumer find what they’re looking for. In most cases, there is no customisation either.

The thing with data pukes is that data pukes beget data pukes. The first time you come across a report or dashboard that is a data puke, and you have no choice but to consume it, you work hard to get your useful nuggets from it. The next time you come across the same data puke (most likely in the next edition of the report, or the next time you open the dashboard), it takes less effort to get your insight. Soon enough, having been exposed to the data puke multiple times, you become an expert at getting insight out of it.

Your ability to sift through this particular data puke and make sense of it becomes your competitive advantage. And so you demand that the puker continue to puke out the data in the same manner. Even if they were to figure out that they can present it in a better way, you (and people like you) will have none of that, for that will then chip away at your competitive advantage.

And so the data puke continues.