Shorting and efficient markets

The trigger for this blogpost is a tweet by my favourite newsletter-writer Matt Levine. He wrote:

And followed up with:

I responded to him with a tweet, but thought this was blogworthy, so I’m putting it here.

The essential difference between an iPhone and an Apple share is that in the short run, the supply of the former is constantly increasing while the supply of the latter is fixed (follow-on offers, stock splits, etc., which increase the supply of shares, are rare events).

The difference arises because the “default action” (“do nothing”, caused by inertia) has different impacts on the two market structures. In a market with constantly increasing supply, if customers “do nothing”, there will soon be a supply-demand mismatch and the manufacturer will have to take corrective action.

When the supply of the commodity is fixed, on the other hand (like an Apple share), the default action (“do nothing”) has no impact on the prices. You need a stronger method to express your displeasure, and that method is the ability to short the stock.

Discrete and continuous diseases

Some three years back I was diagnosed with ADHD and put on a course of Methylphenidate. The drug worked, made me feel significantly better and more productive, and I was happy that a problem that should have been diagnosed at least a decade earlier had finally been diagnosed.

Yet, there were people telling me that there was nothing particularly wrong with me, and that everyone goes through the common symptoms of ADHD. It is a fact that if you go through the ADHD questionnaire (not linking to it here), there is a high probability of an error of commission. If you believe you have it, you can will yourself into answering such that the test indicates that you have it.

Combine this with the claim that there is a heavy error of commission in terms of diagnosis and drugging (claims are that some 10% of American kids are on Methylphenidate) and it can spook you, and make you question whether your diagnosis is correct. It doesn’t help matters that there is no objective diagnostic test to detect ADHD.

And then you read articles such as this one, which talks about ADHD in kids in Mumbai. And this spooks you from the other direction. Looking at some of the cases mentioned there, you realise yours is nowhere near as bad, and you start wondering if you really suffer from the same condition as some of the people mentioned in the piece.

The thing with a condition such as ADHD is that it is a “continuous” disease, in that it occurs in different people to varying degrees. So if you ask a question like “does this person have ADHD” it is very hard to give a straightforward binary answer, because by some definitions, “everyone has ADHD” and by some others, where you compare people to the likes of the girl mentioned in the Mid-day piece (linked above), practically no one has ADHD.

Treatment also differs accordingly. Back when I was taking the medication, I used to take about 10mg of Methylphenidate per day. A friend, who is also on Methylphenidate at a comparable dosage, informs me that there are people who are on the same drug at dosages that are several orders of magnitude higher. In that sense, the medical profession has figured out the continuous nature of the problem and learnt to treat it accordingly (a “bug”, however, is that it is hard to determine the optimal dosage up front, and it is arrived at through a process of trial and error).

The problem is that we are used to binary classification of conditions – you either have a cold or you don’t. You have a fever or you don’t (though arguably once you have a fever, you can have a fever to different degrees). You have typhoid or you don’t. And so forth.

So coming from this binary prior of classifying diseases, continuous diseases such as ADHD are hard for some people to fathom. And that leads to claims of both over- and under-medication, and it also makes clinical research pretty hard.

Do I have ADHD? Again it’s hard to give a binary answer to that. It depends on where you want to draw the line.

Means, medians and power laws

Following the disbursement of Rs. 10 lakh by the Andhra Pradesh government for the family of each victim killed in the stampede on the Godavari last week, we did a small exercise to put a value on the life of an average Indian.

The exercise itself is rather simple – you divide India’s GDP by its population to get the average productivity (this comes out to Rs. 1 lakh). The average Indian is now 29 and expected to live to 66 (another 37 years). Assume a nominal GDP growth rate of 12%, annual population growth of 2% and a cost of capital of 8% (the long-term bond yield), and you value the average Indian life at Rs. 52 lakh.
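
For concreteness, here is a rough sketch of that discounted cash flow calculation, in R. I’m taking per-capita income growth as nominal GDP growth net of population growth; the exact assumptions behind the original figure may differ slightly.

    # rough sketch: discount all future per-capita earnings
    income     <- 100000    # mean per-capita income, Rs. 1 lakh
    gdp_growth <- 0.12      # nominal GDP growth
    pop_growth <- 0.02      # population growth
    discount   <- 0.08      # cost of capital (long-term bond yield)
    years      <- 37        # remaining life expectancy (66 - 29)

    # assume per-capita income grows at GDP growth net of population growth
    growth <- (1 + gdp_growth) / (1 + pop_growth) - 1

    # present value of future earnings, expressed in Rs. lakh
    value <- sum(income * ((1 + growth) / (1 + discount))^(1:years)) / 100000
    value    # roughly 52

Swapping in the median income of Rs. 50000 in place of the mean gives the Rs. 26 lakh figure that comes up later in this post.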

People thought the amount the AP government disbursed was itself on the higher side, yet we have come up with a higher number. The question is whether our calculation is accurate.

We came up with the Rs. 1 lakh per head figure by taking the arithmetic mean of the productivity of all Indians. The question is if that is the right estimate.

Now, it is a well established fact that income and wealth follow a power law distribution. In fact, Vilfredo Pareto came up with his “Pareto distribution” (the “80-20 rule” as some people term it) precisely to describe the distribution of wealth. In other words, some people earn (let’s stick to income here) amounts that are several orders of magnitude higher than what the average earns.

A couple of years ago someone did an analysis (I don’t know where they got the data) and concluded that a household earning Rs. 12 lakh a year is in the “top 1%” of the Indian population by income. Yet, if you talk to a family earning Rs. 12 lakh per year, they will most definitely describe themselves as “middle class”.

The reason for this description is that though these people earn a fair amount, among people who earn more than them there are people who earn a lot more.

Coming back, if income follows a power law distribution, are we still correct in using the mean income to calculate the value of a life? It depends on how we frame the question. If we ask “what is the average value of an Indian life” we need to use mean. If we ask “what is the value of an average Indian life” we use median.

And for the purpose of awarding compensation after a tragedy, the compensation amount should be based on the value of the average Indian life. Since incomes follow a power law distribution, so does the value of lives, and it is not hard to see that the average of a power law distribution can be skewed by the numbers at one extreme.

For that reason, a more “true” average is one that is more representative of the population, and there is no better metric for this than the median (other alternatives are “mean after knocking off top X%” types, and they are arbitrary). In other words, compensation needs to be paid based on the “value of the average life”.

The problem with the median income is that it is tricky to calculate, unlike the mean, which is straightforward. No good estimates of the median exist, for we need to rely on surveys for this. Yet, if we look around with a cursory Google search, the numbers that are thrown up are in the Rs. 30000 to Rs. 50000 range (and these are numbers from different time periods in the past). Bringing forward the older numbers, we can assume that the median per capita income is about Rs. 50000, or half the mean per capita income.

Considering that the average Indian earns Rs. 50000 per year, how do we value his life? There are various ways to do this. The first is to do a discounted cash flow of all future earnings. Assuming nominal GDP growth of about 12% per year, population growth 2% per year and long-term bond yield of 8%, and that the average Indian has another 37 years to live (66 – 29), we value the life at Rs. 26 lakh.

The other way to value the life is based on “comparables”. The Nifty (India’s premier stock index) has a Price to Earnings ratio of about 24. We could apply that on the Indian life, and that values the average life at Rs. 12 lakh.

Then, there are insurance guidelines. It is normally assumed that one should insure oneself up to about 7 times one’s annual income. And that means we should insure the average Indian at Rs. 3.5 lakh (the Pradhan Mantri Suraksha Bima Yojana provides insurance of Rs. 2 lakhs).

When I did a course on valuations a decade ago, the first thing the professor taught us was that “valuation is always wrong”. Based on the numbers above, you can decide for yourself if the Rs. 10 lakh amount offered by the AP government is appropriate.


Vistara and Indigo

Earlier today the Air Traffic Controller of Bangalore tweeted that Air Vistara had a 100% on time performance in Bangalore.

My immediate reaction was that this was because Air Vistara is positioned as a premium service, and hence their schedule is more “sparse” and has greater “slack”. That, I mentioned, has a direct consequence on their on-time performance.

The Directorate General of Civil Aviation puts out monthly reports on the performance of airlines in India. The data they dispense is very interesting, but the format is horrible – a PDF embedded into a 20th-century web page. If you can parse the above link, there are a number of insights to be gleaned.

Firstly, a full 63% of flight delays in India (for the month of June) have been classified as “reactionary” (not cutting and pasting the image here because I don’t want to desecrate this blog by putting a pie chart on it). This is what airport announcers term as “delay caused due to delay in incoming aircraft”.

In other words, what happens is that airlines over-optimise their schedules, leaving little slack between two consecutive flights for a particular aircraft. And so any delay in any flight cascades through the length of the day for that particular aircraft. My hypothesis (I haven’t found data to back this up) is that Vistara has a more relaxed schedule than other airlines and hence has better on-time performance.

It is also pertinent to mention that Vistara has a much lower passenger load factor than other airlines. The average Vistara flight in June was only about 60% full, comfortably putting it in last place. Perhaps the premium pricing hasn’t been attracting the kind of passengers hoped for. Or they’re not marketing well to the right kind of people.

The other airline which merits mention here is Indigo, which seems to be running away with the market. Not only is it comfortably number 1 with a consistent 37% market share, it also has the lowest proportion of cancelled flights, a pretty high passenger load factor (86%) and better on-time performance than any of the other large airlines.

Airlines is an industry with significant positive feedback – if you are on time, not only do more people want to fly with you, but you can also run a more efficient schedule. And there are definite economies of scale in maintenance, schedule density and so forth. Indigo is taking advantage of all of these.

It may not be a particularly profitable industry, but the airline industry is surely interesting to watch!

No Chillr, Go Ahead

This is yet another “delayed post” – one that I thought up some two weeks back but am only getting down to writing now.

After some posts that I’ve done recently on the payments system, I decided to check out some of the payment apps, and installed Chillr. It was recommended to me by a friend who has an HDFC Bank account, who told me that the app is now widely used in his office to settle bills among people. Since I too have an account with that bank, I was able to install it.

The thing with Chillr is that currently they are tied up with only HDFC Bank. You can still sign on if you have an account of another bank, but in that case you can only receive (and not send) money through the system. So your incentive for installing is limited.

Installation is not very straightforward, since you have to enter some details from your netbanking that are not the “usual” things. One is a password that allows you to receive money using the app, and the other is a password that allows you to send money. Both are generated by the bank and sent to your phone as an SMS, which the app automatically reads. I understand this is part of the system itself and won’t go away irrespective of the app you use.

Once you have installed it, you will be able to use the app to transfer money to your contacts who are also on the app, without needing to know their account numbers. The payment process is extremely smooth, with an easy-to-use second factor of authentication (a PIN that you have set for the app, so it is instant), so if more people use it, it can ease a large number of payments, including small ones.

The problem, though, is that it is currently in a “walled garden”, in that only customers of HDFC Bank can send money, and hence the uptake of the app is limited. The app allows you to see who on your contact list is on the app (since that is the universe to which you can send money using it). The last time I checked, there were four people on the list. One was the guy who recommended the app to me, the second was another friend who works in the same organisation as him, the third a guy who works closely with banks, and the fourth a venture capitalist. And my phonebook runs into the high hundreds at least.

In terms of technology, the app is based on the IMPS platform, which means there is nothing that prevents it from transferring money across banks using its current level of authentication. This is very good news, since it means that once banks are signed on, the integration is seamless and there are no technological barriers to payment.

The problem, however, is that the sector suffers from the “2ab problem” (read my argument in favour of net neutrality using the 2ab framework). Different tech companies are signing on different banks (Chillr with HDFC; Ping Pay with Axis; etc.) and such banks will be loath to sign on multiple tech companies (possibly due to integration issues; possibly due to no-compete clauses).

Currently, if HDFC Bank has a users and Axis Bank has b users, and they use Chillr and Ping Pay respectively, the total value added to the system by both Chillr and Ping Pay is proportional to a^2 + b^2 (network effects, Metcalfe’s law and all that). But if these companies merge, or one of them gets the account of the other’s bank, then you have a single system with a+b users, and the value added to the system by the combined payments entity is (a+b)^2 which is a^2 + b^2 + 2 ab. Currently the sector is missing the 2ab. The good news, however, is that there are no tech barriers to inter-bank payments.
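
To make the arithmetic concrete, here is a toy illustration in R; the user counts are entirely made up.

    # toy illustration of the "2ab problem"; user counts are made up
    a <- 200000                 # hypothetical Chillr (HDFC) users
    b <- 150000                 # hypothetical Ping Pay (Axis) users
    separate <- a^2 + b^2       # value of two walled gardens (Metcalfe's law)
    combined <- (a + b)^2       # value of one interoperable network
    combined - separate         # the missing 2ab = 2 * a * b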

Postscript: The title is a direct translation of a popular and perhaps derogatory Kannada phrase.

Startup bragging and exaggeration rights

It seems to be common knowledge that startups like to exaggerate their results when they talk to the media. While I’ve known this for a long time, I was rather startled to see the numbers put out by a company I know, which seem to be an order of magnitude larger than what is actually the case. And when I discussed this with someone else in the know, I was told that this degree of overstating (especially to the media) is a common thing in the startup world.

In “normal” companies, overstatement of numbers is a massive crime, and shareholders can prosecute the management for such activities. Yet, investors in startups (funded startups seem to do this all the time) don’t seem to mind this at all. What is the difference?

“Normal” steady-state companies usually don’t have to raise capital often. After they’ve raised a certain amount, hit steady state and gone public, raising more capital is a rare event. Also, the fact that they are public means that “gullible households” own their equity, and investor protection laws mean that they need to state incomes and other financial information to the best of their knowledge; any cooking of the books can lead to prosecution.

For a startup, on the other hand, raising capital is a “normal” (as opposed to “extraordinary”) event, and its investors are mostly sophisticated investors (apart from gullible employees who have been forced to take equity for “skin in the game”). By overstating its numbers, especially in the popular media (and hopefully not with the Registrar of Companies), a startup can hope to create greater buzz, which increases the likelihood of getting the next round of investment at a higher valuation.

Notice that in this case investors are also okay with the books having been cooked since they aren’t playing the dividend game but have a short term goal of raising more funds at higher valuations. And if overstating numbers can help that, so be it!

Genetic Algorithms

I first learnt about Genetic Algorithms almost exactly thirteen years ago, when they were taught to me by Prof. Deepak Khemani as part of a course on “artificial intelligence”. I remember liking the course a fair bit, and took a liking to the heuristics and “approximate solutions” after the mathematically intensive algorithms course of the previous semester.

The problem with the course, however, was that it didn’t require us to code the algorithms we had learnt (for which we were extremely thankful back then, since in term 5 of Computer Science at IIT Madras, this was pretty much the only course that didn’t involve too many programming assignments).

As a result, while I had learnt all these heuristics (hill climbing, simulated annealing, taboo search, genetic algorithms, etc.) fairly well in theory, I had been at a complete loss as to how to implement any of them. And so far, during the course of my work, I had never had an opportunity to use any of these techniques. Until today that is.

I can’t describe the problem here since this is a client assignment, but when I had to pick a subset from a large set that satisfied certain properties, I knew I needed a method that would reach the best subset quickly. A “fitness function” quickly came to mind and it was obvious that I should use genetic algorithms to solve the problem.

The key with using genetic algorithms is that you need to be able to code the solution in the form of a string, and then define functions such as “crossover” and “mutation”. Given that I was looking for a subset, coding it as a string was rather easy, and since I had unordered subsets, the crossover was also easy – basic random number generation. Within fifteen minutes of deciding I should use GA, the entire algorithm was in my head. It was only a question of implementing it.

As I started writing the code, I started getting fantasies of being able to finish it in an hour and then write a blog post about it. As it happened, it took much longer. The first cut took some three hours (including some breaks), and it wasn’t particularly good, and was slow to converge. I tweaked it a bit but things didn’t improve by much.

And that was when I realised that I had done the crossover wrong – when two strings have elements in common and need to be crossed over, I had to take care that the common elements did not repeat within the same “child” (I needed the subsets to be of a certain length). So that needed some tweaking in the code. That done, the code still seemed inefficient.

The selection was also wrong. If I started off with 10 strings, I would form 5 pairs from them (each string participating in exactly one pair), which would result in 10 new strings. I would then put these 20 strings (the original 10 and the new 10) through a fitness test and discard the weakest 10. And iterate. The problem was that the strongest strings had as much of a chance of reproducing as the weakest. This was clearly not right.

So I tweaked the code so that the fitter strings had a higher chance of reproducing than the less fit. This required me to put the fitness test at the beginning of each iteration rather than the end. I had to refactor the code a little bit to make sure I didn’t repeat computations. Now I was drawing pairs of strings from the original “basket” and randomly crossing them over. And putting them through the fitness test. And so forth.
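
I can’t share the client code, but here is a minimal sketch of the general structure in R, with a made-up fitness function (maximising the sum of random per-item scores) standing in for the real one:

    # minimal sketch; the fitness function here is a stand-in for illustration
    set.seed(42)

    n_items  <- 100                       # size of the full set
    k        <- 10                        # size of the subset we want
    pop_size <- 20                        # candidate subsets per generation
    n_gens   <- 200
    scores   <- runif(n_items)            # made-up per-item scores

    # fitness of a candidate subset (a vector of k distinct item indices)
    fitness <- function(s) sum(scores[s])

    # crossover: sample k items from the union of the parents,
    # so elements common to both never repeat within a child
    crossover <- function(p1, p2) sample(union(p1, p2), k)

    # mutation: occasionally swap one item for a random outsider
    mutate <- function(s) {
      if (runif(1) < 0.1) {
        s[sample(k, 1)] <- sample(setdiff(seq_len(n_items), s), 1)
      }
      s
    }

    # initial population: random subsets
    population <- replicate(pop_size, sample(n_items, k), simplify = FALSE)

    for (gen in seq_len(n_gens)) {
      # fitness test at the start of the iteration; fitter subsets get a
      # proportionally higher chance of being picked as parents
      fit <- vapply(population, fitness, numeric(1))
      children <- replicate(pop_size, {
        parents <- sample(pop_size, 2, prob = fit)
        mutate(crossover(population[[parents[1]]], population[[parents[2]]]))
      }, simplify = FALSE)

      # keep the fittest pop_size subsets out of parents and children
      combined <- c(population, children)
      population <- combined[order(vapply(combined, fitness, numeric(1)),
                                   decreasing = TRUE)[1:pop_size]]
    }

    best_subset <- population[[1]]
    fitness(best_subset)

The fitness test sits at the top of each iteration, fitter subsets are proportionally more likely to be picked as parents, and the crossover draws from the union of the two parents so that common elements never repeat within a child.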

I’m extremely happy with the results of the algorithm. I’ve got just the kind of output that I had expected. More importantly, I was extremely happy with the process of coding the whole thing up. I did the entire coding in R, which is what I use for my data analysis (the data size meant I didn’t need anything quicker).

The more interesting part is that this only solved a very small part of the problem I’m trying to solve for my client. Tomorrow I’m back to solving a different part of the problem. Genetic algorithms have served their purpose. Back when I started this assignment I had no clue I would be using genetic algorithms. In fact, I had no clue what techniques I might use.

Which is why I get annoyed when people ask me what kind of techniques I use in my problem solving. Given the kind of problems I take on, most will involve a large variety of math, CS and statistics techniques, each of which will only play a small role in the entire solution. This is also the reason I get annoyed when people put methods they are going to use to solve the problem on their pitch-decks. To me, that gives an impression that they are solving a toy version of the problem and not the real problem – or that the consultants are grossly oversimplifying the problem to be solved.

PS: Much as some people might describe it that way, I wouldn’t describe Genetic Algorithms as “machine learning”. I think there’s way too much intervention on the part of the programmer for it to be described thus.