Ratings revisited

Sometimes I get a bit narcissistic, and check how my book is doing. I log on to the seller portal to see how many copies have been sold. I go to the Amazon page and see what other books people who have bought my book are buying (on the US store it’s Ray Dalio’s Principles, as of now. On the UK and India stores, Sidin’s Bombay Fever is the beer to my book’s diapers).

And then I check if there are new reviews of my book. When friends write them, they notify me, so it’s easy to track. What I discover when I visit my Amazon page are the reviews written by people I don’t know. And so far, most of them have been good.

So today was one of those narcissistic days, and I was initially a bit disappointed to see a new four-star review. I started wondering what this person found wrong with my book. And then I read through the review and found it to be wholly positive.

A quick conversation with the wife followed, and she pointed out that this reviewer perhaps reserves five stars for the exceptional. And then my mind went back to this topic that I’d blogged about way back in 2015 – about rating systems.

The “4.8” score that Amazon gives as an average of all the ratings on my book so far is a rather crude measure – since one reviewer’s 4* might mean something quite different from another reviewer’s 4*.

For example, my “default rating” for a book might be 5/5, with 4/5 reserved for books I don’t like and 3/5 for atrocious books. On the other hand, you might use the “full scale” and use 3/5 as your average rating, giving 4 for books you really like and very rarely giving a 5.

By simply taking an arithmetic average of ratings, it is possible to overstate the quality of a product that has for whatever reason been rated mostly by people with high default ratings (such a correlation is plausible). Similarly a low average rating for a product might mask the fact that it was rated by people who inherently give low ratings.

As I argue in the penultimate chapter of my book (or maybe the chapter before that – it’s been a while since I finished it), one way that platforms foster transactions is by increasing information flow between the buyer and the seller (this is one thing I’ve gotten good at – plugging my book’s name in random sentences), and one way to do this is by sharing reviews and ratings.

From this perspective, for a platform’s judgment on a product or seller (usually it’s the seller, but on platforms such as Airbnb, information about buyers also matters) to be credible, it is important that ratings be aggregated in the right manner.

One way to do this is to use some kind of a Z-score (normalising each rating relative to the other ratings that the rater has given) and then come up with a normalised rating. But then this needs to be readjusted for the quality of the other items that this rater has rated. So you can think of performing some kind of Singular Value Decomposition on the ratings matrix to find out the “true value” of a product (ok this is an achievement – using a linear algebra reference given how badly I suck at the topic).
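To make the Z-score idea concrete, here is a minimal sketch in Python (pandas), using raters and ratings that I have made up purely for illustration. It is only the first step – it normalises each rating against the rater’s own history, but does not yet adjust for the quality of the items each rater happens to have rated, which is where the SVD-type machinery would come in:

```python
import pandas as pd

# Toy data: each row is (rater, item, rating on a 1-5 scale).
# Raters and ratings are entirely made up for illustration.
ratings = pd.DataFrame([
    ("generous_rater", "book_a", 5), ("generous_rater", "book_b", 5),
    ("generous_rater", "book_c", 4),
    ("tough_rater",    "book_a", 4), ("tough_rater",    "book_b", 3),
    ("tough_rater",    "book_c", 2),
], columns=["rater", "item", "rating"])

# Normalise each rating against the rater's own history:
# subtract the rater's mean rating and divide by the rater's std deviation.
by_rater = ratings.groupby("rater")["rating"]
ratings["z"] = (ratings["rating"] - by_rater.transform("mean")) / by_rater.transform("std")

# Aggregate per item: the raw average versus the average of normalised ratings.
summary = ratings.groupby("item").agg(raw_avg=("rating", "mean"),
                                      normalised_avg=("z", "mean"))
print(summary)
```

The normalised average strips out each rater’s personal scale before comparing items; the full matrix-factorisation version would go a step further and adjust for item quality as well.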

I mean – it need not be THAT complicated, but the basic point is that it is important that platforms aggregate ratings in the right manner in order to convey accurate information about counterparties.

The (missing) Desk Quants of Main Street

A long time ago, I’d written about my experience as a Quant at an investment bank, and about how banks like mine were sitting on a pile of risk that could blow up at any time.

There were two problems, as I had documented then. Firstly, most quants I interacted with seemed to be solving maths problems rather than finance problems, not bothering to check whether their models would stand the test of the markets. Secondly, there was an element of groupthink, as quant teams were largely homogeneous and it was hard to progress while holding contrarian views.

Six years on, there has been no blowup, and in some sense banks are actually doing well (I mean, they’ve declined compared to the time just before the 2008 financial crisis but haven’t done that badly). There have been no real quant disasters (yes I know the Gaussian Copula gained infamy during the 2008 crisis, but I’m talking about a period after that crisis).

There can be many explanations for why banks have not had any quant blow-ups despite quants solving maths problems and all thinking alike, but the one I’m partial to is the presence of a “middle layer”.

Most of the quants I interacted with were “core”, in the sense that they were not attached to any sales or trading desks. Banks also typically had a large cadre of “desk quants”, directly associated with trading teams, who built models and helped with day-to-day risk management, pricing, etc.

Since these desk quants worked closely with the business, they turned out to be much more pragmatic than the core quants – they had a good understanding of the market and used the models more as guiding principles than as rules. At the same time, they brought the benefits of quantitative models (and the work of the core quants) into the day-to-day business.

Back during the financial crisis, I’d jokingly suggested that other industries should hire the quants who were now surplus to Wall Street’s requirements. Around the same time, DJ Patil et al came up with the concept of the “data scientist” and called it the “sexiest job of the 21st century”.

And so other industries started getting their own share of quants, or “data scientists” as they were now called. Nowadays it’s fashionable even for small companies, for whose business data is not critical, to have a data science team. Being in this profession now (I loathe calling myself a “data scientist” – I prefer to say “quant” or “analytics”), I’ve come across quite a few of these.

The problem I see with “data science” on “Main Street” (this phrase gained currency during the financial crisis as the opposite of Wall Street, in that it referred to “normal” businesses) is that it lacks the cadre of desk quants. Most data scientists are highly technical people who don’t necessarily have an understanding of the business they operate in.

Thanks to that, what I’ve noticed is that in most cases there is a chasm between the data scientists and the business, since they are unable to talk in a common language. As I’m prone to saying, this can go two ways – the business guys can either assume that the data science guys are geniuses and take their word as gospel, or they can totally disregard the data scientists as people who do some esoteric math and don’t really understand the world. In either case, the value added is suboptimal.

It is not hard to understand why “Main Street” doesn’t have a cadre of desk quants – it’s because of the way the data science industry has evolved. Quant work at investment banks evolved over a long period of time – the Black-Scholes equation was proposed in the early 1970s. So quants were first recruited to work directly with the traders, and core quants (at the banks that have them) were a later addition, when banks realised that some quant functions could be centralised.

On the other hand, the whole “data science” growth has been rather sudden. The volume of data, cheap incrementally available cloud storage, easy processing and the popularity of the phrase “data science” have all grown rapidly in the last decade or so, and companies have scrambled to set up data teams. There has simply been no time to train people who get both the business and the data – and so the data scientists exist as addendums that are either worshipped or ignored.

Levels and shifts in analysing games

So Nitin Pai and Pranay Kotasthane have a great graphic on how India should react to China’s aggression at Doka La. While the analysis is excellent, my discomfort is with the choice of “deltas” as the axes of this payoff diagram, rather than levels.

Source: Nitin Pai and Pranay Kotasthane

Instead, what might have been preferable would have been to define each country’s strategies in terms of levels of aggression, identify their current levels of aggression, and evaluate the two countries’ strategies in terms of moving to each possible alternative level. Here is why.

The problem with using shifts (or “deltas” or “slopes” or whatever you call the movement between levels) is that they are not consistent. Putting it mathematically, the tangent at a point doesn’t measure the rate of change of the curve once you move far away from the point at which the tangent was calibrated.

To illustrate, let’s use this diagram itself. The recommended strategy is that India should “hold”. From the diagram, if India holds, China’s best option is to escalate. In the next iteration, India continues to hold, and China continues to escalate. After a few such steps, surely we will be far enough away from the current equilibrium that the payoff for changing stance is very different from what is represented by this diagram?

This graph is perhaps valid for the current situation where (say) India’s aggression level is at 2 on a 1–5 integer scale, while China is at 3. But will the payoffs of going up and down by a notch be the same if India is still at 2 and China has reached the maximum pre-war aggression of 5 (remember that both are nuclear powers)?
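To see why the same one-notch shift can be worth very different amounts at different levels, here is a toy sketch in Python. The payoff function is entirely invented (it simply penalises mismatched levels and, more sharply, high combined levels of aggression), so treat it as an illustration of the level-based framing rather than as a model of Doka La:

```python
# Aggression levels are integers on a 1-5 scale for each country.
# The payoff function is completely made up, purely to illustrate that
# the value of "escalate by one notch" depends on the current levels.
def india_payoff(india_level: int, china_level: int) -> float:
    mismatch_cost = abs(india_level - china_level)            # cost of being out of step
    escalation_cost = (india_level + china_level) ** 2 / 10   # war risk grows sharply
    return -mismatch_cost - escalation_cost

# The same "delta" (India escalating from 2 to 3) evaluated at two states:
near_status_quo = india_payoff(3, 3) - india_payoff(2, 3)  # China at 3
china_maxed_out = india_payoff(3, 5) - india_payoff(2, 5)  # China at 5
print(near_status_quo, china_maxed_out)  # roughly -0.1 vs -0.5: not the same payoff
```

In a level-based payoff matrix, the value of “escalate” is read off afresh at each state; in the delta framing, it is implicitly assumed constant.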

On the flip side, the good thing about using payoffs based on changes in level is that it keeps the payoff diagram small, and this is especially useful when the levels cannot be easily discretised or there are too many possible levels. Think of a 5×5 square graph, or even a 10×10, in place of the 3×3, for example – soon it can get rather unwieldy. That is possibly what led Nitin and Pranay to choose the delta graph.


Newsletter!

So after much deliberation and procrastination, I’ve finally started a newsletter. I call it “the art of data science” and the title should be self-explanatory. It’s pure unbridled opinion (the kind that usually goes on this blog), except that I only write about one topic there.

I intend to have three sections and then a “chart of the edition” (note how cleverly I’ve named this section to avoid giving away much about the frequency of the newsletter!). In this edition, though, I ended up putting in too much harikathe (long-winded storytelling), so I restricted myself to two sections before the chart.

I intend to talk a bit in each edition about some philosophical aspect of dealing with data (this section got a miss this time), a bit about data analysis methods (I went a bit meta this time) and a little bit about programming languages (which I used for a bit of bitching).

The fact that I plan to include a “chart of the edition” means I need to read newspapers a lot more, since you are much more likely to find gems (in either direction) there than elsewhere. For the first edition, I picked a good graph I’d seen on Twitter, and it’s about Hull City!

Anyway, enough of this meta-harikathe. You can read the first edition of the newsletter here. In case you want to get it in your inbox each week/fortnight/whenever I decide to write it, then subscribe here!

And do send feedback (by email, not in the comments here) on what you think of the newsletter!

Using all available information

In “real-life” problems, it is not necessary to use all the given data. 

My mind goes back eleven years, to the first exam in the Quantitative Methods course at IIMB. The exam contained a monster probability problem. It was so monstrous that only two or three out of my batch of 180 could solve it. And it was monstrous because it required you to use every given piece of information (most people missed the “X and Y are independent” statement, since this bit of information was in words, while everything else was in numbers).

In school, you get used to solving problems where you are required to use all the given information and only the given information to solve the given problem. Taken out of the school setting, however, this is not true any more. Sometimes in “real life”, you have problems where next to no data is available, for which you need to make assumptions (hopefully intelligent) and solve the problem.

And there are times in “real life” when you are flooded with so much data that a large part of the problem-solving process is the identification of what data is actually relevant and what you can ignore. It can often happen that different pieces of given information contradict each other, and deciding what to use and what to ignore is critical to an efficient solution – and that decision is an art form.

Yet, in the past I’ve observed that people are not happy when you don’t use all the information at your disposal. The general feeling is that ignoring information leads to a suboptimal model – one which could be bettered by including the additional information. There are several reasons, though, that one might choose to leave out information while solving a real-life problem:

  • Some pieces of available information are mutually contradictory, so taking all of them into account will lead to no solution.
  • A piece of data may not add any value after taking into account the other data at hand
  • The incremental impact of a particular piece of information is so marginal that you don’t lose much by ignoring it
  • Making use of all available information can lead to increased complexity in the model, and the incremental impact of the information may not warrant this complexity
  • It might be possible to use established models if you use only part of the information – you lose some precision in exchange for a known model. Not always recommended, but done.

The important takeaway, though, is that knowing what information to use is an art, and this forms a massive difference between textbook problems and real-life problems.

Means, medians and power laws

Following the disbursement of Rs. 10 lakh by the Andhra Pradesh government for the family of each victim killed in the stampede on the Godavari last week, we did a small exercise to put a value on the life of an average Indian.

The exercise itself is rather simple – you divide India’s GDP by its population to get the average productivity (this comes out to Rs. 1 lakh per year). The average Indian is now 29 and expected to live to 66 (another 37 years). Assume a nominal GDP growth rate of 12%, annual population growth of 2% and a cost of capital of 8% (the long-term bond yield), and you value the average Indian life at Rs. 52 lakh.
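For what it’s worth, here is a rough reconstruction of that back-of-the-envelope calculation in Python. The growing-annuity structure, and the assumption that per-capita income grows at roughly nominal GDP growth net of population growth, are my own reading of the numbers, so treat this as a sketch rather than the exact calculation:

```python
# All figures are in Rs. lakh per year, using the assumptions quoted above.
income = 1.0                 # current average productivity (GDP per capita)
growth = 1.12 / 1.02 - 1     # per-capita growth: 12% nominal GDP, 2% population
discount = 0.08              # cost of capital (long-term bond yield)
years = 37                   # 66 - 29 remaining years

# Present value of future earnings, growing at `growth`, discounted at `discount`.
pv = sum(income * (1 + growth) ** t / (1 + discount) ** t
         for t in range(1, years + 1))
print(f"Value of the average life: Rs. {pv:.0f} lakh")  # a shade over Rs. 51 lakh
```

Since the present value is linear in the starting income, halving the income to the median figure used later halves the valuation – which is where the Rs. 26 lakh number further down comes from.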

People thought that the amount the AP government disbursed was itself on the higher side, yet we have come up with an even higher number. The question is whether our calculation is accurate.

We came up with the Rs. 1 lakh per head figure by taking the arithmetic mean of the productivity of all Indians. The question is if that is the right estimate.

Now, it is a well established fact that income and wealth follow a power law distribution. In fact, Vilfredo Pareto came up with his “Pareto distribution” (the “80-20 rule” as some people term it) precisely to describe the distribution of wealth. In other words, some people earn (let’s stick to income here) amounts that are several orders of magnitude higher than what the average earns.

A couple of years ago, someone did an analysis (I don’t know where they got the data) and concluded that a household earning Rs. 12 lakh a year is in the “top 1%” of the Indian population by income. Yet, if you talk to a family earning Rs. 12 lakh per year, they will most definitely describe themselves as “middle class”.

The reason for this description is that though these people earn a fair amount, among people who earn more than them there are people who earn a lot more.

Coming back, if income follows a power law distribution, are we still correct in using the mean income to calculate the value of a life? It depends on how we frame the question. If we ask “what is the average value of an Indian life”, we need to use the mean. If we ask “what is the value of an average Indian life”, we use the median.

And for the purpose of awarding compensation after a tragedy, the compensation amount should be based on the value of the average Indian life. Since incomes follow a power law distribution, so does the value of lives, and it is not hard to see that the average of a power law distribution can be skewed by values at one extreme.

For that reason, a more “true” average is one that is more representative of the population, and there is no better metric for this than the median (the alternatives are of the “mean after knocking off the top X%” variety, and they are arbitrary). In other words, compensation needs to be paid based on the “value of the average life”.
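A quick simulation shows how far apart the mean and the median of a power law can be. The shape and scale parameters below are arbitrary choices for illustration – they are not estimates of India’s actual income distribution:

```python
import numpy as np

rng = np.random.default_rng(42)

# Draw a million "incomes" from a Pareto distribution.
# Shape and scale are arbitrary choices, not fitted to Indian data.
shape, scale = 1.5, 50_000
incomes = (rng.pareto(shape, size=1_000_000) + 1) * scale

print(f"mean income:   {incomes.mean():,.0f}")
print(f"median income: {np.median(incomes):,.0f}")
# The mean comes out well above the median, dragged up by a handful
# of very large incomes in the right tail of the distribution.
```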

The problem with the median income is that it is tricky to calculate, unlike the mean, which is straightforward. No good estimates of the median exist, for we need to rely on surveys for this. Yet, a cursory Google search throws up numbers in the Rs. 30000 to Rs. 50000 range (and these are numbers from different time periods in the past). Bringing the older numbers forward, we can assume that the median per capita income is about Rs. 50000, or half the mean per capita income.

Considering that the average Indian earns Rs. 50000 per year, how do we value his life? There are various ways to do this. The first is to do a discounted cash flow of all future earnings. Assuming nominal GDP growth of about 12% per year, population growth 2% per year and long-term bond yield of 8%, and that the average Indian has another 37 years to live (66 – 29), we value the life at Rs. 26 lakh.

The other way to value the life is based on “comparables”. The Nifty (India’s premier stock index) has a Price to Earnings ratio of about 24. We could apply that to the Indian life, and that values the average life at Rs. 12 lakh.

Then, there are insurance guidelines. It is normally assumed that one should insure oneself up to about 7 times one’s annual income. And that means we should insure the average Indian at Rs. 3.5 lakh (the Pradhan Mantri Suraksha Bima Yojana provides insurance of Rs. 2 lakhs).

When I did a course on valuations a decade ago, the first thing the professor taught us was that “valuation is always wrong”. Based on the numbers above, you can decide for yourself if the Rs. 10 lakh amount offered by the AP government is appropriate.

 

The Art of Drawing Spectacular Graphs

Bloomberg Business has a feature on the decline of the Euro after the Greek “No” vote last night. As you might expect, the feature is accompanied by a graphic which shows a “precipitous fall” in the European currency.

I’m in two minds about whether to screenshot the graphic (so that any further changes are not reflected), or to avoid plagiarising by simply putting a link (but exposing this post to the risk of becoming moot if Bloomberg changes its graphs later on). It seems like the graphic on the site is a PNG, so let me go ahead and link to it:

You notice the spectacular drop, right? Cliff-like. You think the Euro is doomed now that the Greeks have voted “no”? Do not despair, for all you need to do is look at the axis, and the axis labels.

The “precipitous drop” shown in the above graph represents a movement of the EUR/USD from about 1.11 to about 1.10 – a fall of 0.88%, as the text accompanying the graph itself says! And given how volatile the EUR/USD has been over the last couple of months (look at the graph below), this is not that significant!

EUR/USD over the last couple of months
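To see how much of the “cliff” is simply the choice of y-axis, here is a small matplotlib sketch. The series below is randomly generated to hover a little above 1.10 – it is a stand-in for the EUR/USD, not Bloomberg’s data:

```python
import matplotlib.pyplot as plt
import numpy as np

# A made-up random-walk series hovering a little above 1.10,
# standing in for the EUR/USD.
rng = np.random.default_rng(7)
series = 1.11 + np.cumsum(rng.normal(0, 0.0005, 100))

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Left: let matplotlib zoom the y-axis tightly around the data -
# small moves look like cliffs.
ax1.plot(series)
ax1.set_title("Tight y-axis: a cliff")

# Right: the same data on a y-axis anchored near zero -
# the same move is barely visible.
ax2.plot(series)
ax2.set_ylim(0, 1.2)
ax2.set_title("Wide y-axis: a ripple")

plt.tight_layout()
plt.show()
```

Neither axis choice is “wrong” – but the tight axis is what makes a 0.88% move look like a collapse.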

 

I won’t accuse Bloomberg of dishonesty since they’ve clearly mentioned “0.88%”, but they sure know how to use graphics to propagate their message!