What’s in a shirt number?

There is a traditional way of allotting shirt numbers to football players. “Back to front, right to left”, goes the rule. The goalkeeper is thus number 1. Irrespective of the system used, the right back is number 2, and usually the left forward/winger is number 11.

Now, the way different teams allot numbers depends on their historical formations, and on how their current formations have evolved from them. The two key historical formations are the 2-3-5 (mostly played in Europe) and the W-M (which originated in South America).

You can read Jonathan Wilson’s excellent Inverting the Pyramid to find more about how formations evolved. This post, however, is about shirt numbers in the ongoing world cup.

Now, given the way numbering has evolved in different countries, each number (between 1 and 11) has a traditional set of roles associated with it. 1 is the goalie everywhere, and 2 is the right back everywhere. 3 is a left back in Europe, but a right centre back in South America. 4 is a central midfielder in England, but a centre back in Spain and South America. 6 is a central defender in England, a left back in Brazil/Argentina and a midfielder in Spain.

These are essentially conventions born of how numbering systems have evolved, not rules. However, they are so ingrained in the traditional football watcher’s mind that when a player wears a shirt number not normally associated with his position, it appears “wrong”.

For example, William Gallas, a centre back (and occasional right back) by trade, moved from Chelsea to Arsenal in 2006 and promptly got the number 10 shirt, which is usually reserved for a central attacker/attacking midfielder (in fact, the number now defines the role – it is simply called the “number 10 role”). Last season, West Ham used two successive left backs (Razvan Rat and Pablo Armero), and both were allotted number 8 – a number traditionally allocated to a central midfielder.

In this post, we will look at the squads at the ongoing world cup and try to understand how many players are wearing “wrong” shirt numbers. In order to do this, we look at the most common roles associated with each number, and identify players who don’t fit the convention.

Figures 1 and 2 have the summary of the distribution of roles according to shirt number.




As we can see, all number 1s are goalkeepers (perhaps there is a FIFA rule to this effect). Most number 2s and 3s are defenders, but the odd midfielder and forward wears these numbers too. Iranian forward Khosro Heydari wears 2, as do Greek midfielder Ioannis Maniatis and Bosnian midfielder Avdija Vrsajevic.

The most unnatural number 3 (in his defence, he’s always worn 3) is Ghanaian striker Asamoah Gyan. Iranian midfielder Ehsan Aji Safi also wears a 3, contrary to convention.

As discussed earlier, midfielders from a few countries wear 4, but there are also two forwards who wear that number – Japanese Keisuke Honda and Australian Tim Cahill. This can be explained by the fact that both of them started off as midfielders, and then turned into forwards, but perhaps wanted to keep their original numbers.

5 is split entirely between defenders and midfielders, who also account for most of the number 6s. The one exception is Russia’s Maksim Kanunnikov, who is a forward. Interestingly, as many as six number 7s (associated with a right winger in both the 2-3-5 and W-M systems) are listed as defenders! This includes Colombia’s left back Armero, who notoriously wore 8 for West Ham last year. This might be explained by players who started off as wingers and then moved back, but kept their numbers. Two defenders – Costa Rica’s Heiner Mora and Australia’s Bailey Wright – wear number 8.

Number 9 is again one of those numbers associated with a specific role – the centre forward. In fact, in recent times there is a variation of this called the “false nine” (there is also a “false ten” now). We would thus expect all number nines to be centre forwards, but a few midfielders also get that number. Prominent among them is Newcastle’s Cheick Tiote, who wears 9 for Cote D’Ivoire.

10 is split between midfielders and forwards (as expected), but a few defenders wear 11. Croatian captain and right back Darijo Srna wears 11, as does Greek defender Loukas Vyntra.

Beyond 11, there is no real convention in terms of shirt numbering. The only interesting thing is in the numbers allotted to the reserve goalkeepers (notice that no goalies take any number between 2 and 11). By far, 12 is the most popular number allotted to the reserve goalkeeper, but some teams use 13 as well. Then, 22 and 23 are also pretty popular numbers for goalkeepers.

Finally, we saw that Iran was the culprit in allocating numbers 2 and 3 to non-defenders. Greece, too, came up as a repeat offender in terms of allocating inappropriate numbers. Can we build a “number convention index” and see which countries deviated most from the numbering conventions?

Now, there are degrees of unconventionality, and these need to be accommodated in the analysis. For example, a midfielder wearing 4 (there are 6 of them) is pretty normal, but a forward wearing 3 is plain wrong. A forward wearing 8 is not “correct”, but not “wrong” either – which shows we need more than a simple binary scoring system.

What we will do is first identify the most common position for each number; every player in that position gets a score of 1. Every other player wearing that number gets a score equal to the number of players in his position wearing that number, divided by the number of players wearing that number who occupy its most popular position.

In case that didn’t make sense, let me use an example. Take number 2: the most popular position for a number 2 is in defence, so every defender who wears 2 gets 1 point. Two midfielders wear 2, compared to 29 defenders, so each midfielder who wears 2 gets 2/29 points. One forward wears 2, and he gets 1/29 points.

Taking number 10, the most common position for the number is forward (there are 17 of them), and they all get 1 point. The remaining 15 players who wear 10 are all midfielders, and they get 15/17 points each (notice this is not much less than 1).
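The scoring rule above can be sketched in a few lines of Python. The counts here are made up to mirror the number-2 example (29 defenders, two midfielders, one forward); the real analysis would of course use the full squad lists.

```python
from collections import Counter

# Illustrative data mirroring the number-2 example above: 29 defenders,
# two midfielders and one forward wear number 2. Not the real squads.
players = [(2, "DF")] * 29 + [(2, "MF")] * 2 + [(2, "FW")]

def convention_scores(players):
    """Score each (number, position) pair by how 'normal' it is."""
    # Count positions per shirt number across all players
    by_number = {}
    for number, pos in players:
        by_number.setdefault(number, Counter())[pos] += 1

    scores = []
    for number, pos in players:
        counts = by_number[number]
        mode_pos, mode_count = counts.most_common(1)[0]
        if pos == mode_pos:
            scores.append(1.0)                       # conventional number
        else:
            scores.append(counts[pos] / mode_count)  # fraction of the mode
    return scores

scores = convention_scores(players)
# Each defender gets 1.0, each midfielder 2/29, the forward 1/29
```

A team’s score is then simply the sum of its 23 players’ individual scores.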

This way, each member of each squad gets allotted points based on how “normal” his shirt number is given his position. Summing up the points across players of a team, we get a team score on how “natural” the shirt numbers are. The maximum score a team can get is 23 (each player wearing a number appropriate for his position).

Table 3 here has the team-wise information on the correctness of shirt numbers. The team with the worst-allocated shirt numbers happens to be Nigeria, with 16.13. At the other end, the team that has allocated numbers most appropriately is Ecuador, with 21.01.

Country  Score
Nigeria         16.13
Costa Rica         16.85
Greece         17.22
Australia         17.45
Iran         17.69
Ivory Coast         17.74
Colombia         18.19
Cameroon         18.19
Argentina         18.29
USA         18.41
Italy         18.63
Portugal         18.66
Algeria         18.68
Honduras         18.69
France         19.02
Ghana         19.06
Croatia         19.08
Netherlands         19.20
Mexico         19.23
Brazil         19.28
Chile         19.37
Japan         19.52
South Korea         19.69
England         19.77
Russia         20.03
Switzerland         20.04
Uruguay         20.25
Spain         20.27
Bosnia & Herzegovina         20.37
Belgium         20.83
Germany         20.86
Ecuador         21.01


This, however, may not tell the complete story. As we saw earlier, conventions regarding numbers between 12 and 23 are not as strict, and thus these numbers can get allocated in a more random fashion compared to 1-11. There are absolutely no taboos related to numbers 12-23, and thus, misallocating them is less of a crime than misallocating 1-11.

Hence, we will look at the numbers 1 to 11, and see how teams have performed. Table 4 has this information:

Country  Score
Australia           7.45
USA           8.04
Iran           8.18
Greece           8.59
Nigeria           8.63
Ivory Coast           8.71
Costa Rica           8.84
Ghana           8.95
Croatia           8.96
Japan           9.06
Colombia           9.27
Brazil           9.29
Spain           9.66
Bosnia & Herzegovina           9.67
Uruguay           9.70
Honduras           9.71
Portugal           9.78
Italy           9.81
South Korea           9.83
Cameroon           9.83
Chile           9.87
Russia           9.94
England           9.97
Argentina         10.03
Switzerland         10.18
Netherlands         10.24
Algeria         10.30
Ecuador         10.34
Mexico         10.35
France         10.53
Belgium         10.88
Germany         11.00


Germany has a “perfect” first eleven in terms of number allocation, and Belgium comes close. At the other end of the scale we have Australia, which seems to have the most misallocated 1-11 shirt numbers. Iran and Greece, which we anecdotally saw as having high misallocations, are at three and four, with the United States at two.

Note: The data is taken from the Guardian Data Blog. This analysis should be taken with a pinch of salt, since in the modern game the division of players into “defender”, “midfielder” and “forward” is not straightforward. Where would you put a “classic number ten”? What about a wing back? And so forth.

Money can buy me Premier League performance

The following graph plots premier league performance (in terms of points) for the 2012-13 season as a function of the team’s wage bill. Apart from a few outliers here and there, the correlation is astounding:



The red line is the line of best fit (according to a linear regression) and comparing team standings with respect to the line shows how well teams performed relative to what their wages would predict.

It is interesting to see that Manchester City almost fall off the charts in terms of wages, yet they could not translate this into on-pitch performance. It can also be seen that Manchester United, Spurs and Everton significantly over-performed given their wage bills.

Based on the wage bill alone, it would have been reasonably easy to predict that Wigan Athletic and Reading would get relegated at the end of the season – though it must be mentioned that they underperformed even their modest wage bills – while QPR should have done a lot better given the size of their pay packet.

A simple linear regression of points against wage bill shows that every GBP 4 million increase in the wage bill is worth roughly one additional point in the premier league! And the regression has an R-squared of 69% – meaning the wage bill alone explains 69% of the variation in a team’s performance, which is a remarkably strong relationship.
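As a sketch of the calculation – with made-up wage and points figures, not the actual 2012-13 numbers – the slope and the R-squared can be computed like this:

```python
import numpy as np

# Made-up wage bills (GBP million) and points totals for ten teams --
# illustrative only, not the actual 2012-13 Premier League data.
wages  = np.array([230.0, 180, 150, 100, 75, 60, 50, 40, 35, 30])
points = np.array([ 78.0,  89,  72,  63, 61, 46, 42, 41, 39, 36])

# Fit points ~ slope * wages + intercept by least squares
slope, intercept = np.polyfit(wages, points, 1)

# R-squared: share of the variance in points explained by the wage bill
pred = slope * wages + intercept
ss_res = ((points - pred) ** 2).sum()
ss_tot = ((points - points.mean()) ** 2).sum()
r_squared = 1 - ss_res / ss_tot

print(f"{slope:.2f} points per GBP 1m; R^2 = {r_squared:.2f}")
```

A slope of 0.25 points per GBP 1 million would correspond to the “GBP 4 million per point” figure quoted above.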

The screenshot of the regression is given below:


Note that in this post we use only the wage bill, and not any transfer fees paid. The assumption is that the two are reasonably correlated, so we are not losing much information by using the wage bill alone.



Classifying cricket grounds

For some work, I’m trying to classify cricket grounds – the question is whether we can group them by the kind of cricket they support. Some pitches are slow and low – it is hard to score runs, but also hard to get the batsman out. Others are fast and bouncy – easy to score and easy to get out. Then you have the “batting pitches” – easy to score and hard to get out – and the “bowling pitches” – hard to score but easy to take wickets on.

Essentially I’m trying to see if I can classify a ground into one of the above four regimes (or a superposition of them) at different stages in a game – this will help estimate how the rest of the game is going to play out.

For this, I was looking at the runs per ball and balls per wicket statistics for a number of grounds, based on T20 matches. All grounds that hosted over 10 T20 matches (international or IPL) before the 10th of April have been considered for this analysis. It is interesting, to say the least.

Here is the scatter plot – the bottom right (only the Oval) is easy to score, easy to get out. Top right are the batting pitches, bottom left the bowling pitches, and top left the slow-and-low! It is interesting that the “most bowling pitch” of the lot is Chittagong! The only Indian ground that can be classified thus is the DY Patil Sports Academy in Navi Mumbai!
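A crude way to automate the quadrant assignment: put a cutoff on each axis and label each ground accordingly. The cutoffs and the per-ground figures below are invented for illustration (the medians of the actual data would be a natural choice of cutoff).

```python
# Hypothetical per-ground statistics: (runs per ball, balls per wicket).
# These numbers are made up for illustration, not the actual T20 data.
grounds = {
    "The Oval":   (1.45, 13.0),
    "Chittagong": (1.05, 14.0),
    "DY Patil":   (1.10, 15.0),
    "Ground X":   (1.40, 20.0),   # a made-up batting paradise
}

RPB_CUT = 1.25   # above this: easy to score
BPW_CUT = 16.0   # above this: hard to get out

def regime(runs_per_ball, balls_per_wicket):
    easy_to_score = runs_per_ball >= RPB_CUT
    hard_to_dismiss = balls_per_wicket >= BPW_CUT
    if easy_to_score and hard_to_dismiss:
        return "batting pitch"        # easy to score, hard to get out
    if easy_to_score:
        return "fast and bouncy"      # easy to score, easy to get out
    if hard_to_dismiss:
        return "slow and low"         # hard to score, hard to get out
    return "bowling pitch"            # hard to score, easy to get out

for name, stats in grounds.items():
    print(name, "->", regime(*stats))
```

The same function, evaluated at different stages of a game, would give the “superposition of regimes” mentioned above.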


Wasps have thin tails, or why cricket prediction algorithms fail..

A couple of months back, I was watching a One Day International between New Zealand and India. It was during this series that the WASP algorithm – which predicts the end-of-innings score for the team batting first, and the chances of victory for the team batting second – first came into public prominence. During the games, the scorecard would show these numbers, derived from the WASP algorithm.

In this particular game (I think it was the fourth match of the series), New Zealand were chasing. They were fifty-odd without loss, and WASP showed their chances of winning at somewhere around 60%. Then Jesse Ryder got out, and suddenly the chances of winning shown on screen dropped to the forties! How can the fall of one wicket, at such an early stage of the game, influence the odds so much? The problem is with the algorithm.

Last year, during the IPL, I tried running a graphic that I called the “Tug-of-War”, meant to depict how the game swung between the two teams. Taking the analogy forward: if you imagine the game of cricket as a game of tug-of-war, the graph plots the position of the middle of the rope as a function of time. Here is a sample graphic from last year:



This shows how the game between the Pune Warriors and Sun Risers “moved”. The upper end of the graph represents the Sun Risers’ “line” and the lower end that of the Pune Warriors. We can see from this graph that for the first half of the Sun Risers innings (they batted first), the Sun Risers were marginally ahead. Then Pune pulled it back in the second half, and then somewhere midway through the Pune innings, the Sun Risers pulled it back again, and eventually won the game.

At least that’s the intention with which I started putting out this graphic. In practice, you can see that there is a problem. Check out the graph somewhere around the 8th over of the Pune innings. This was when Marlon Samuels got out. How can one event change the course of the game so dramatically? It was similar to the movement in the WASP when Ryder got out in the recent NZ-India match.

So what is the problem here? Based on the WASP algorithm that its designers have kindly published, and the algorithm I used for last year’s IPL (which was Monte Carlo-based), the one thing common to both is that they are Markovian (I know mine is, and from what WASP has put out, I’m guessing theirs is, too). In English: the algorithms assume that the outcome of the next ball depends only on the current state of the game (score, wickets, balls remaining), and not on the path by which the game got there. The odds of the different events on the next ball (dot, six, single, out, etc.) are taken to be independent of how the previous balls have shaped up. Since that doesn’t accurately represent what happens in a cricket match – form, pitch behaviour and pressure all carry over from ball to ball – we end up with “thin tails”.
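To see where the thin tails come from, here is a minimal Markovian simulator in the spirit of (but much cruder than) the algorithms above: every ball has the same outcome probabilities regardless of match situation. The probabilities are invented, not fitted to real data.

```python
import random
import statistics

# Per-ball outcome probabilities, identical for every ball -- the
# Markovian assumption in its crudest form. Invented numbers.
OUTCOMES = [0, 1, 2, 4, 6, "W"]
PROBS    = [0.45, 0.35, 0.07, 0.08, 0.02, 0.03]

def simulate_innings(rng):
    """One ODI first innings: 300 balls or 10 wickets, whichever first."""
    runs, wickets = 0, 0
    for _ in range(300):
        outcome = rng.choices(OUTCOMES, PROBS)[0]
        if outcome == "W":
            wickets += 1
            if wickets == 10:
                break
        else:
            runs += outcome
    return runs

rng = random.Random(42)
scores = [simulate_innings(rng) for _ in range(5000)]
print(statistics.mean(scores), statistics.stdev(scores))
```

Because every ball is independent and identically distributed, the simulated scores bunch tightly around the mean – exactly the peaked blue curve described below, rather than the wider spread of real innings.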

Recently, to evaluate IPL matches with a view to evaluating players ahead of the auction, I reverse engineered the WASP algorithm, and decided to see what it says about the score at the end of an ODI innings. Note that my version is team agnostic, and assumes that every ball is bowled by “the average bowler” to “the average batsman”. The distribution of the team score at the end of the first innings, as calculated by my algorithm, is the blue line in the graph below. The red line shows the actual distribution of scores at the end of an ODI innings over the last 5 years (the same data used to construct the model).


Note how the blue curve has a much higher peak, and tails off very quickly on either side. In other words, a lot of “mass” is situated within a small range of scores, and this leads to the bizarre situations as you can see in the first graph, and what I saw in the New Zealand India game.

The problem with a dynamic programming based approach, such as WASP, is that you need to make a Markovian assumption, and that assumption results in thin tails. And when you are trying to predict the probability of victory, and are using a curve such as the blue one above as your expected distribution of score at the end of the innings, events such as a six or a wicket can drastically alter your calculated odds.

To improve the cricket prediction system, what we need is an algorithm that can replicate the “fat tails” that the actual distribution of cricket scores shows. My current Monte Carlo based algorithm doesn’t cut it. Neither does the WASP.

The pressure of chasing a target in One Day Internationals

I was looking at the average runs scored per over in One Day Internationals from 2009 onwards (data from cricsheet). The data is presented in the graph below. What is striking is the difference in runs scored per over between the team batting first and the team batting second.




The blue line shows the runs per over for the team batting first, and the red line for the team batting second. These figures are averaged over all ODIs from 2009 till the end of the recent Asia Cup. What you will notice is that scoring patterns differ between the first and second innings.

For the first part of the innings, till almost the 35th over, the team batting second scores much faster than the team batting first. Then, somewhere around the 40th over, the two lines cross, and the blue line pulls away from the red one – really fast.

In the last over of the innings, for example, the team batting first is expected to score ten runs, while the team batting second is expected to score only eight and a half. In the forty-fifth over, the team batting first scores seven runs on average, while the chasing team scores only six!

The difference in scoring patterns is striking, and the most plausible explanation is the pressure of chasing! When you have a target in mind, you are unable to bat as freely as when you are setting one. Consequently, you do not score as many runs.

The next question is if there is a variation across teams. Given below is the same graph as above, but plotted by batting team.




The graphs are smaller, so the gaps aren’t too visible, but if you look for a gap between the blue and red lines by team, you will find that the biggest gaps are for India, New Zealand and Australia! Sri Lanka and Pakistan, however, seem to bat similarly irrespective of whether they are setting a target or chasing!



Privacy and network effects

It is intuitive that some people are more concerned about their privacy than others. These people usually connect to the internet via a VPN (to prevent snooping), avoid popular applications that rank lower on privacy (shunning Facebook, for example), and are strict about installing only those apps on their phones that don’t ask for too much privacy-revealing information.

The vast majority, however, is not particularly concerned about privacy – as long as a reasonable amount of privacy exists, and their basic transactions are safe, they are happy to use any service that is of value to them.

Now, with the purchase of WhatsApp by Facebook, the former (more privacy-conscious) brand of people are worried that WhatsApp – which famously refused to collect user data, did not store messages and did not show advertisements – is now going to move to the “dark side”. Facebook, in the opinion of some of these people, is notorious for constantly changing its privacy terms (making it harder for you to truly secure your data there), and they suspect that WhatsApp will go the same way sooner rather than later. And so they have begun their search for an alternative messenger service.

The problem, however, is that WhatsApp is a network-effect-based service. A messenger service is of no use to you if your friends don’t use it. BlackBerry Messenger, for example, was limited in its growth because only users with BlackBerries could use it (before the company belatedly released an Android app). With people moving away from BlackBerries (in favour of iOS and Android), BBM essentially died.

I see posts on my Facebook and Twitter timelines asking people to move to a messenger service called Telegram, which is supposedly superior to WhatsApp in its privacy practices. I also see posts arguing that Telegram is not all that much better, and that you are better off sticking to WhatsApp. Based on these posts, it seems likely that some people will want to move away from WhatsApp. The question is whether network effects will allow them to do so.

Email is not a network-effect-based service. I can use my GMail to email anyone with a valid email address, irrespective of who their provider is. This allows people with more esoteric preferences to choose an email provider of their choice without compromising on connectivity. The problem is that the same doesn’t apply to messenger services, which are app-locked: you can use WhatsApp to message only friends who also have WhatsApp. Thus, the success (or lack of it) of messenger services will be primarily driven by network effects.

For whatever reasons, WhatsApp has got a significant market share in messenger applications, and going by network effects, their fast pace of growth is expected to continue. The problem for people concerned about privacy is that it is useless for them to move to a different service, because their less privacy conscious friends are unlikely to make the move along with them. Unless they want to stop using messenger services altogether, they are going to be locked in to WhatsApp thanks to network effects!

There is one upside to this for those of us who are normally not so worried about privacy. That these privacy conscious people are locked in to WhatsApp (thanks to network effects) implies that there will always be this section of WhatsApp users who are conscious about privacy, and vocal about it. Their activism is going to put pressure on the company to not dilute its privacy standards. And this is going to benefit all users of the service!

Was the RR-CSK match on 12 May 2013 fixed?

The Justice Mudgal committee which looked into possible corruption in the Indian Premier League has mentioned that the game between Rajasthan Royals and Chennai Super Kings played in Jaipur on the 12th of May 2013 was possibly fixed. CSK, batting first, were 83 without loss in 11 overs, at which point their “mascot” (let’s call him that since his official status is unclear) Gurunath Meiyappan allegedly said that the team was unlikely to score over 140 (refer to the video with Gaurav Kalra and Sharda Ugra on Cricinfo). The team finished on 141, with Dwayne Bravo finishing with a quick 23 in 11 balls.

I have an algorithm similar to the WASP algorithm used in the recent New Zealand-India ODI series which I use to evaluate player performances in each game. For this particular game, the following table shows the batting ratings (according to this algorithm) for various players.


You can notice that apart from Dwayne Bravo, all batsmen from Chennai Super Kings (look at the batting column) had a negative rating. The two players with the most negative batting ratings, you can see, are Ravindra Jadeja and M Vijay. The question is which of these two was more culpable for the innings slowdown in the latter half.

Our algorithm allows us to analyze performances in parts of innings, so let us break the innings into two – before Mike Hussey got out (at 11.3 overs) and after. When did Vijay collect his -10 score?


Vijay started slowly, getting to a cumulative -5.3 after four overs. Then, starting in the sixth over he started hitting out. By the time Hussey got out in the twelfth over, he was at 10.17. Raina and Dhoni both perished in the 13th over. At the end of that, Vijay was at a still respectable 8.49 (the wickets falling having evidently slowed him down). And then Ravindra Jadeja walked in.

For the next four overs, with Vijay at the crease, he diminished his team’s chances of winning by 8%, 4.5%, 5.6% and 1% respectively (a total of about 19%). He then got out, and Bravo came in to make amends and take the team to 141. What of Jadeja?


It is interesting to note that Jadeja held steady while Vijay was slowing things down (overs 13 to 17), but once Vijay got out, he had two massively horrible overs – the 18th and the 20th (he didn’t get to bat in the 19th, when Bravo took all the strike!).

Was it the handiwork of some particular bowler that quietened Jadeja in overs 18 and 20? No! The following graph shows the over-wise performance of the Chennai Super Kings (a negative number means Rajasthan Royals got the upper hand in that over). The colour of the bars varies by bowler; no one bowler did superlatively well for RR.


The negatives in the 12th and 13th overs are on account of wickets falling. Then there is a series of negatives with Vijay and Jadeja batting. Bravo comes in and gets himself a positive, but Jadeja continues to rack up heavy negatives – and, again, no single bowler was responsible.

Draw your own conclusions.




Sangakkara and the IPL Auction

Sri Lankan cricketer Kumar Sangakkara has decided to not participate in this year’s IPL auction. In the opinion of this blog, this is an extremely smart decision, for Sri Lankan cricketers are unlikely to be available for a large part of this year’s IPL, thanks to their tour of England starting in May. Let me explain.

The IPL Auction is a strange beast. Each team has a salary cap, and players are auctioned across teams such that a team spends no more than its cap in total. Now, in case a player is not available for a part of the tournament (essentially due to a clash with international commitments), the fee paid to the player is pro-rated according to the number of matches for which he is available. However, when calculating the team’s spend against the salary cap, the player’s full-season salary is counted.

For example, suppose Sangakkara were to participate in the auction and command a salary of Rs. 5 Crore. If he is available for only 40% of his team’s games, he would be paid Rs. 2 Crore. However, when his team’s total salary is measured against the cap, the full Rs. 5 Crore is taken into account.
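The cap arithmetic described above fits in a few lines. The rule and the Rs. 5 Crore / 40% figures are from the post; the function name is my own.

```python
def auction_outcome(auction_price, availability):
    """Returns (amount actually paid, amount charged to the salary cap).

    auction_price is in Rs. Crore; availability is the fraction of the
    team's matches the player can play. Per the rule described above,
    pay is pro-rated but the full auction price hits the cap.
    """
    paid = auction_price * availability
    cap_charge = auction_price
    return paid, cap_charge

paid, cap_charge = auction_outcome(5.0, 0.40)   # the Sangakkara example
# paid = Rs. 2 Crore, cap_charge = Rs. 5 Crore
```

The gap between `paid` and `cap_charge` is exactly the dead weight a team carries for a partially available player.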

Assuming that the salary cap is the real reason as to why teams don’t bid too much for a player (as opposed to capacity to pay), teams will not want to let go of a large amount of their salary cap for a player who is unlikely to be available for the full tournament. Thus, if Sangakkara were to enter the IPL Auction this year, he is likely to be undervalued, and hence he has decided to not take part in the IPL at all.

What Sangakkara is betting on is that teams in the auction will take a short-term perspective, looking only at this year’s commitments to determine a player’s availability. Ideally, since the auctions are for the purchase of a player for three years, teams should take into account the tours scheduled during the IPL windows of all three years (the gap in India’s schedule shows when the IPL will take place, and a player from any country with cricket scheduled at that time will not be available for the IPL). However, perhaps due to the uncertainty in next year’s schedules (thanks to the proposed ICC revamp), teams are likely to let only this year’s commitments guide their bidding.

Sangakkara has said that he plans to take part in next year’s IPL, and he hopes for a much better valuation then compared to this year, for he will be free of international commitments. Given that the salary cap for the teams increases by only 5% (Rs. 3 Crore) next year, what he will be banking on is that teams might release some high value players they will be employing this year.

Tailpiece: Given that the English domestic calendar invariably clashes with the IPL, English Test players are going to be forever undervalued in the IPL. At least they should be if teams are intelligent about their bidding.

Tailpiece 2: Samit Patel and Alex Hales have a deal with their county, Nottinghamshire, under which they will be allowed to play in the IPL only if they can get a fee of USD 400,000 (about Rs. 2.5 Crore). They have both set their base prices at Rs. 2 Crore. It will be interesting to see if and how teams go about picking them!


Trading and liquidity

Every time there is some activity in the football transfer market, you are likely to hear one of two things. Either a particular player was “a steal” or the buyer “overpaid”. You seldom hear that a player was bought or sold at a “fair price”. What drives this?

Note that the issue is not perception – if you look at the transfer dealings, you are likely to find that the general opinion of whether the transfer fee was too high or too low is in most cases fairly accurate. Even if it is not accurate at the time of the transfer, it gets borne out in the subsequent year or two after sale.

Two weeks back I took a class in introductory economics for a bunch of people who hope to get elected to the Bangalore Municipal Council (BBMP). Teaching them about demand and supply, and trade, I mentioned that in any voluntary trade, both the buyer and the seller are “winners”. For example, if Liverpool sold Fernando Torres to Chelsea for GBP 50 million, it means two things: One, the value that Liverpool placed on the future contribution of Torres to the club was less than GBP 50 million. Two, the value that Chelsea placed on the future contribution of Torres was more than GBP 50 million. If either of the above conditions were not true, the deal would not have happened.

So why is it that football transfers usually end up costing too much or too little? The answer lies in “liquidity”. Liquidity is a concept that is normally used in financial markets as a measure of the depth of the market. It measures how many people are willing to buy and sell a particular commodity at a particular point in time. The theory is that the greater the number of buyers and sellers for a particular commodity, the better is the price discovery. I’ve said this several times before – it is unfortunate that the concept of liquidity doesn’t find as much traction in mainstream economics literature.

Coming back to football – why is it that players are typically either undervalued or overvalued? Because players are unique, and that makes the market illiquid. Let us go back to the deal that took Torres to Chelsea. Say the value Chelsea placed on his future services was GBP 50 million, and the value Liverpool placed on them was GBP 35 million (numbers pulled out of thin air). Given that Liverpool owned him, the deal could have taken place at any value between these two numbers (at any price between 35 and 50 million, both Liverpool and Chelsea would be willing to trade)! So why did the deal take place at one end of the spectrum?
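The “value spectrum” idea can be written down directly. The valuations are the thin-air numbers from the paragraph above; the desperation parameter is my own invention, a 0-to-1 knob saying how much keener the buyer is to deal than the seller.

```python
def deal_price(seller_value, buyer_value, buyer_desperation):
    """Feasible deal price, or None if the valuations don't overlap.

    buyer_desperation in [0, 1]: 1 means the buyer is far keener to do
    the deal, pushing the price to the top of the value spectrum.
    """
    if buyer_value <= seller_value:
        return None   # no price makes both sides better off
    return seller_value + buyer_desperation * (buyer_value - seller_value)

# Torres: Liverpool value him at 35m, Chelsea at 50m, Chelsea desperate
print(deal_price(35, 50, 0.95))   # lands near the 50m end
```

With many comparable players on the market, competition would pull `buyer_value` down toward `seller_value`, shrinking the interval; with a unique player, the whole interval is in play and desperation decides where the price lands.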

It was a consequence of how badly the two clubs wanted to do the deal. While Torres had lost form and hadn’t been performing in the 2010-11 season, Liverpool were quite happy holding on to him – they were not desperate to do the deal. Even when offered an amount higher than their valuation of the player, they sensed Chelsea’s desperation in doing the deal. So Liverpool’s game here was to hold on long enough until they knew Chelsea had bid an amount they were unlikely to improve on, and then they sold.

Sometimes fans like to sing something like “there is only one Fernando Torres” (typically when he scores). And that is the precise reason that Liverpool was able to get a premium on his sale. There was a certain kind of player whom Chelsea desperately wanted to buy, and Torres was the one who fit the bill perfectly. Given the lack of comparables, and the desperation of the buyer, it became a seller’s market and Liverpool were able to profit from it.

So we have seen that when the buyer is more desperate to do the deal than the seller, the deal takes place at the higher end of the “value spectrum” (a phrase I’ve just made up). It can go the other way, too. When Liverpool sold Torres, they (rather unwisely) invested most of the proceeds in buying a player called Andy Carroll from Newcastle United. Carroll turned out to be a dud – he was increasingly injury prone, and when a new manager, Brendan Rodgers, came in, he found Carroll unsuited to the style of football Liverpool wanted to play.

The presence of Carroll in the squad, however, put pressure on the manager to play him – largely a consequence of the fee that had been paid for him. Rodgers therefore decided it was better to cut his losses and remove Carroll from the squad than to play a suboptimal brand of football just to accommodate him. Rodgers correctly treated the money spent on buying Carroll as a “sunk cost”.

Now, in the year and a half since his arrival at Liverpool, Carroll had done much to convince people that he was overvalued. His injuries and lack of form meant that clubs were unwilling to value him highly, and given Liverpool’s determination to sell, it was a buyer’s market. The GBP 15 million that Liverpool extracted from West Ham for the sale was perhaps exactly the value that Liverpool themselves had placed on Carroll – the bottom end of the value spectrum.

To summarize – you sell if the price is higher than your valuation. You buy if the price is lower than your valuation. The buyer’s and seller’s valuations together determine the “value spectrum” along which a sale can be done. The presence of comparable commodities means that buyers and sellers can go for substitutes, which shrinks the value spectrum. In the case of footballers, who have few comparables, there are no such compressing factors, and the full extent of the spectrum is available.

In a large number of cases, either the buyer or the seller is much more desperate to do a particular deal than the other. And that pushes the price of the deal to one of the edges of the value spectrum. Hence people end up either significantly underpaying or significantly overpaying for footballers.
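The argument above can be caricatured as picking a point on the value spectrum according to relative desperation. A toy sketch – the linear interpolation is my own simplification, not a claim about how clubs actually negotiate, and the GBP 20 million West Ham valuation in the second example is invented for illustration:

```python
def deal_price(seller_valuation, buyer_valuation, buyer_desperation):
    """Pick a point on the value spectrum.

    buyer_desperation is a number in [0, 1]: 1 means the buyer is
    far more desperate than the seller (the price lands at the top
    of the spectrum, as with Torres), 0 means the seller is the
    desperate one (the price lands at the bottom, as with Carroll).
    """
    low, high = seller_valuation, buyer_valuation
    return low + buyer_desperation * (high - low)

print(deal_price(35, 50, 1.0))  # Torres-like: desperate buyer pays 50.0
print(deal_price(15, 20, 0.0))  # Carroll-like: desperate seller gets 15.0
```

With comparables available, substitution would pull both valuations towards a market price and the interval would shrink; with a unique player, the whole interval is in play and desperation decides where in it the deal lands.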

Who should the IPL franchises retain?

I have a proprietary algorithm for evaluating cricket matches. This algorithm analyzes matches ball-by-ball and then computes the “impact” of each player on the game, in terms of both batting and bowling.
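The algorithm itself is proprietary, and nothing below describes it. But as a hypothetical illustration of the general family of ball-by-ball impact measures, one can credit the batting side with each ball's runs relative to a par score, and debit it for wickets – the par and wicket-cost numbers here are pure assumptions:

```python
PAR_RUNS_PER_BALL = 1.3   # assumed T20 par of ~7.8 runs per over
WICKET_COST = 6.0         # assumed cost of a wicket, in par-run terms

def ball_impact(runs_scored, wicket=False):
    """Impact of one ball, from the batting side's point of view.

    Positive values favour the batsman, negative ones the bowler;
    summing over a player's balls gives a crude season-long impact.
    """
    impact = runs_scored - PAR_RUNS_PER_BALL
    if wicket:
        impact -= WICKET_COST
    return impact

print(ball_impact(0))               # dot ball: hurts the batting side
print(ball_impact(6))               # six: helps it
print(ball_impact(0, wicket=True))  # wicket off a dot ball: big negative
```

A real system would, at minimum, make the par depend on match situation; this sketch only conveys the flavour of per-ball accounting.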

I’ve been intending to do this for a while, but I finally got down to calculating the impact of different players in past editions of the IPL, and working out whom it makes sense for franchises to retain (incidentally, today is the last day for franchises to announce to the IPL which players they are going to retain).

Let us go franchise by franchise and see who the best players are. The numbers in the brackets represent the impact of each player according to my proprietary system.

1. Chennai Super Kings

By a long way, their two best players are MS Dhoni (3.53) and Ravindra Jadeja (3.46). Interestingly, the primary reason for the latter’s high score is his batting (2.86) – he has been bowling well, too (0.6), but it is his batting that has had the bigger impact.

These two are followed, some distance behind, by Raina (2.02) and the now retired Mike Hussey (1.75). Ashwin is some way further back at 0.7 (his bowling is at 1 and batting at -0.33; the algorithm tends to unfairly penalize bowlers for their batting abilities, or the lack thereof).

Chennai have already made their decision on whom to retain: Dhoni, Jadeja, Raina, Ashwin and Dwayne Bravo. The last is a bit of a puzzle, at -1.09. His batting has been excellent – he has contributed 1.52 – but his bowling has been utter crap at -2.61. CSK would do well to use him as a batsman only.
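The headline numbers in this post are simply the sum of a player's batting and bowling impacts, which is how Bravo's excellent batting and poor bowling net out to a negative total. A quick sketch of that bookkeeping, using the CSK figures quoted above (Bravo's bowling component taken as -2.61 so that the parts sum to his -1.09 total):

```python
# Batting and bowling impacts quoted in the post; the headline
# rating is their sum. (The underlying ball-by-ball algorithm is
# proprietary; this reproduces only the arithmetic.)
players = {
    "Ravindra Jadeja": {"batting": 2.86, "bowling": 0.60},
    "Dwayne Bravo":    {"batting": 1.52, "bowling": -2.61},
}

totals = {name: round(p["batting"] + p["bowling"], 2)
          for name, p in players.items()}

for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {total:+.2f}")
# Jadeja nets out to +3.46 and Bravo to -1.09, matching the post.
```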

2. Delhi Daredevils

This is a team that has performed rather badly in the last bunch of IPLs, so they might be expected to dispense with some players. Virender Sehwag (3.14), though, has performed exceptionally amid the rot, though this season’s domestic performance (or the lack of it) might go against him. Next is the injury-prone Irfan Pathan (1.72). Shahbaz Nadeem is a surprise package at 1.56. Even so, I wouldn’t expect them to retain anyone.

Umesh Yadav (-1.77) and Mahela Jayawardene (-2.33) have been especially poor performers.

3. Kings XI Punjab

Another franchise that didn’t do particularly well in the last set of IPLs. David Miller (2.05) was their standout performer, followed by Gurkeerat Singh (1.24), Shaun Marsh (1.11) and Praveen Kumar (1.02). The latter two are highly injury prone, and the franchise may not want to part with a large chunk of their budget for the yet uncapped Gurkeerat. So if you expect them to retain any players, it would only be Miller.

At the other end, Parvinder Awana (-1.92) has been the standout poor performer.

4. Kolkata Knight Riders

Sunil Narine (4.48) and Gautam Gambhir (4.22) tower over the rest. Following them are Shakib al Hasan (1.63) and Iqbal Abdulla (1.13). One would expect them to hold on to the first two (Narine and Gambhir) and try to use their trump card to match a price for Shakib.

Jacques Kallis performed particularly badly (-2.81) and is unlikely to be retained.

5. Mumbai Indians

If you were to rank all players in descending order of impact, the standout player across teams would be Harbhajan Singh (5.04; 3.64 bowling, 1.41 batting). Despite his axing from the national team, one would expect him to be retained by the franchise. He is followed some way behind by Lasith Malinga (2.01), Kieron Pollard (1.97; with 3.05 in batting and -1.09 in bowling) and Rohit Sharma (1.74). One would expect all three to be retained. Dinesh Karthik, at 1.31, might also be retained, for they will only need to give up Rs. 4 Crore from their salary cap to get him.

6. Rajasthan Royals

If one goes by the gossip, the Royals are expected to retain a large number of players. They are the “moneyball” team of the IPL. They don’t spend too much on salary but try to get otherwise undervalued players to play for them.

Brad Hodge (1.91) has been their star performer but his age might go against him – they might prefer to match him using their trump card. They are expected to retain Shane Watson (1.55 with 3.83 batting and -2.28 bowling), though. Stuart Binny at 1.34 is also a good bet to be retained.

Interestingly, the system shows a negative impact for the otherwise highly rated Sanju Samson (-0.17)! He is, however, another player they might retain.

7. Royal Challengers Bangalore

The Royal Challengers have already made their decision – they will retain Chris Gayle (4.93; with 6.51 batting and -1.58 bowling), AB de Villiers (3.12) and Virat Kohli (1.95 with 2.22 batting and -0.27 bowling). The one highly rated player they are not retaining is Zaheer Khan (3.69). Khan has been exceptional considering that his partners in the RCB pace attack are Vinay Kumar (-3.59), RP Singh (-2.83) and Abhimanyu Mithun (-1.69).

Their only other highly rated bowler is Murali Kartik (1.05). They will need to completely rebuild their bowling attack in order to compete in this IPL.

8. Sunrisers Hyderabad

Dale Steyn (3.43) is the standout performer and they would do well to retain him. The next best is Shikhar Dhawan, who is some distance away at 0.72. Given the paucity of quality Indian players, though, they might end up retaining Dhawan also.

I’m willing to share the full results of my analysis. Do reach out to me if you want to play around with it and I’ll send it to you. And let me know what you think of these ratings.