Indian Americans and the Selection Bias

There is this one chart from the Economist that has been doing its rounds over the interwebs over the last few days:

Basically it shows that Indian Americans are much more accomplished academically and professionally compared to other immigrants. And there are many theories floating around as to why Indians are so successful.

The answer, however, is rather simple – selection bias. Migrating from India to the US was an extremely difficult task till the 1960s – there were some quotas that the US had for immigration under which the Indians had nothing. And when Indians did finally start migrating in the 1960s, it was mostly for education.

Most people who migrated from India to the US even in the 1960s and 70s did so to go to graduate school. And this meant that they already had 16 years of education in India, which either meant an engineering or medical degree, or a masters in one of the other fields. So basically most Indians migrating to the US were highly accomplished already.

And considering the kind of foreign exchange controls imposed by the Indian government, the only Indians who could afford to go to the US for an education were those that received a fellowship or support from their universities, which only increased the selection bias! (Now that I’ve mentioned foreign exchange controls, you should listen to this song, which was apparently meant to parody such policies)

Yes, you had the odd Patel without much education who made it to open a “Potel” (Patel-run Motel), but that is probably the reason that the Indian bubble in the above chart is not farther out!

So that Indians have done better than other migrating communities in the US is not about innate Indian intelligence, or innate Indian ability to work hard, or because the Americans took in the Indians much better than other nationalities. It is simple selection bias, based on tight immigration controls and tight emigration controls and stupid foreign exchange policy on the part of the Indian government (which, at one point of time, only allowed citizens to take out eight dollars from the country).
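Here is a rough sketch of the selection argument, in case numbers help. All the figures are made up for illustration: everyone is drawn from the same hypothetical distribution of years of education (mean 10, standard deviation 4), and the only difference between the two migrant groups is the filter they had to pass (no filter versus the 16 years needed for graduate school). The function name `simulate_migrants` is my own invention.

```python
import random
import statistics

def simulate_migrants(n_people, min_years_to_migrate, seed):
    """Draw years of education from one shared (hypothetical) distribution
    and keep only the people who clear the migration filter."""
    rng = random.Random(seed)
    population = [max(0.0, rng.gauss(10, 4)) for _ in range(n_people)]
    return [years for years in population if years >= min_years_to_migrate]

# Same home population both times (same seed); only the filter differs.
open_door = simulate_migrants(100_000, min_years_to_migrate=0, seed=1)
grad_school_only = simulate_migrants(100_000, min_years_to_migrate=16, seed=1)

mean_open = statistics.mean(open_door)
mean_filtered = statistics.mean(grad_school_only)
print(f"no filter:      {len(open_door)} migrants, {mean_open:.1f} years on average")
print(f"16-year filter: {len(grad_school_only)} migrants, {mean_filtered:.1f} years on average")
```

The filtered group looks far more educated even though nothing about the underlying population has changed, which is all that selection bias means here.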

To illustrate this point, look at the country that is “second” (quotes since we are looking at two dimensions here, so second is subjective) in this list – Iran.

Useless LinkedIn

I’m not a big fan of LinkedIn. I mean, I use it, and fairly regularly at that (check it at least once a day), and I think conceptually it’s quite useful. However, in practice, I think there are a number of sticking points about the service, which makes it quite useless.

For starters, its apps (iPad and Android) are quite lousy, and offer nowhere close to the kind of experience that the web interface offers. Things are extremely unintuitive (down to the tabbing order – you compose a message, hit tab and enter, and you don’t send the message. It takes you to the profile of the person you’re messaging instead) on the website. Sometimes the apps show notifications even after you’ve checked them on the web, and so on.

In other words it’s an extremely poorly engineered product, but which is surviving (and thriving) thanks to network effects!

I might have commented on this in the past but there is this thing on endorsements. This was something that coincided with the time when LinkedIn went public (if I’m not wrong), and you could endorse people for their “skills” on LinkedIn. For a while I played along with the game. But then I completely lost it when a distant uncle who I’m sure has never traded derivatives endorsed me for “derivatives”. I quickly deleted my skills.

Then there are the LinkedIn recommendations, which have inherent selection bias and hence add no value. And then you have the “say congrats” feature, where LinkedIn prompts you to “say congrats” on people changing jobs or hitting job anniversaries. I’ve found this mildly useful (dropping a note when someone switches jobs is a good way to stay in touch), but there are bugs when it comes to job downgrades and people getting fired.

And of late, there has been serious spam in terms of people’s status updates. I don’t know when it became popular to post silly puzzles on professional networking sites, yet I find several of them popping up on my timeline every day, and the number of people who have shared each is not funny. Then you have these cartoons (Dilbert and the copycats), and “guru quotes” that appear in the form of images that further spam your timelines! The only way I can think of these being useful is that they act as a negative indicator when you’re checking out the profile of someone you are looking to hire or do business with!

To summarise, LinkedIn seems to be an extremely badly engineered product on several counts, but thanks to network effects (so many people are already on it that entry barriers for competitors are really high) the site still manages to do well! I wonder what it will take to disrupt it. Facebook for business is not the answer for sure – the potential havoc caused by a breach in Chinese walls there will scare people enough to not sign up.

What do you think? Here is their stock price movement for reference:

 

 

Selection bias and recommendation systems

Yesterday I was watching a video on youtube, and at the end of it it recommended another (the “top recommendation” at that point in time). This video floored me – it was a superb rendition of Endaro Mahaanubhaavulu by Mandolin U Shrinivas. Listen and enjoy as you read the rest of the post.

I was immediately bowled over by Youtube’s recommendation system. I had searched for both Shrinivas and Endaro … in the not-so-distant past so Youtube had put two and two together and served me up an awesome rendition! I was so happy that I went to town on Twitter about it.

It was then that I realised that this was the first time ever that I had noticed the top recommendation of Youtube. In other words, every time I use Youtube, it recommends a video to me, but I seldom notice it. And I seldom notice it for a reason – the recommendations are usually irrelevant and crap. The one time I like the video it throws up, though, I feel really happy and go gaga over the algorithm!

In other words, there’s a bias whose exact name I don’t know – the bias that when an event happens in a certain direction, you tend to notice it and give credit where you think it’s due. And when it doesn’t happen that way, you simply ignore it!

In terms of larger implications, this is similar to how legends such as “lucky shirts” are born. When something spectacular happens, you notice everything that is associated with that spectacular event and give credit where you think it’s due (lucky shirt, lucky pen, etc.). But when things don’t go your way you think it’s despite the lucky shirt, not because the shirt has become unlucky.

It’s the same thing with belief in “god”. When you pray and something good happens to you after that, you believe that your prayers have been answered. However, when you pray and something good doesn’t happen, you ignore the fact that you prayed.

Coming back to recommendation systems such as Youtube’s, the problem is that it is impossible for a recommendation system to get recommendations right all the time. There will be times when you get it wrong. In fact, going by my personal experience with Youtube, Amazon, etc. most of the time you will get your recommendation wrong.

The key to building a recommendation system, thus, is to build it such that you maximise the chances of getting it right. Going one step further I can say that you should maximise the chances of getting it spectacularly right, in which case the customer will notice and give you credit for understanding her. Getting it “partly right” most of the time is not enough to catch the customer’s attention.

Putting marketing jargon on it, what you should focus on is delighting the customer some of the time rather than keeping her merely happy most of the time!
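To put some made-up numbers on this, here is a small sketch. It assumes (my assumption, not anything Youtube has published) that each recommendation has a quality between 0 and 1, and that the customer only notices a recommendation when its quality crosses a “delight” threshold. The two recommenders, `steady` and `swingy`, and the threshold of 0.9 are hypothetical.

```python
def summarise(recommender, notice_threshold):
    """recommender: list of (probability, quality) outcomes, quality in [0, 1].
    Returns (average quality, share of recommendations the customer notices)."""
    average_quality = sum(p * q for p, q in recommender)
    noticed_share = sum(p for p, q in recommender if q >= notice_threshold)
    return average_quality, noticed_share

NOTICE_THRESHOLD = 0.9  # hypothetical "delight" cutoff

# "Partly right" every single time.
steady = [(1.0, 0.6)]
# Spectacular one time in five, irrelevant the rest of the time.
swingy = [(0.2, 0.95), (0.8, 0.1)]

steady_avg, steady_noticed = summarise(steady, NOTICE_THRESHOLD)
swingy_avg, swingy_noticed = summarise(swingy, NOTICE_THRESHOLD)
print(f"steady: average quality {steady_avg:.2f}, noticed {steady_noticed:.0%} of the time")
print(f"swingy: average quality {swingy_avg:.2f}, noticed {swingy_noticed:.0%} of the time")
```

Under these assumptions the steady recommender is better on average but never gets noticed, while the swingy one is worse on average and still gets all the credit.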

Selection bias in Catalunya?

Catalunya, where I spent two weeks last month, votes today in an “informal referendum” on whether to secede from Spain. This vote is non-binding after the Spanish Supreme Court declared an earlier “official referendum” called by the Catalan government illegal. As I write this (11 pm IST; 6:30 pm in Catalunya), FT reports that there were “long lines to vote” in the informal referendum today. The same report in the opening paragraph mentions “with an overwhelming majority expected to back a proposal to break away from the rest of the country and form an independent state”.

Looking at it from a pure numbers perspective, this outcome is not unexpected. Consider two hypothetical voters and Barcelona residents Jordi and Jorge (the more observant reader might notice that these names have been chosen carefully) who are respectively for and against the secession. What are their incentives to come out and vote today, as against in a “real referendum”?

As far as Jorge is concerned, today’s vote doesn’t matter to him. Given that today’s referendum is “informal”, which way it goes has, in Jorge’s opinion, absolutely no impact on his life. Thus, he will consider the time and energy he would have to expend in queueing up and voting today to be not worth it. And thus he will not bother. And get on with his life. If today’s referendum were “real”, though, Jorge would have every incentive to register his opposition in the hope that his vote would help tip the vote towards a “no”, and thus he would be voting.

What about Jordi? Even though Jordi knows that today’s vote is only “informal”, he wants to send out a statement that he is in favour of secession. The way he sees it, the larger the majority by which today’s vote will come out in support of secession, the stronger the message that will be sent to Madrid, which he hopes should sooner or later be forced to relent, and permit a real independence vote. As far as Jordi is concerned, today’s vote has tremendous signalling value, and to this end he has every incentive to expend his time and energy and queue up and vote!

Based on this, more Jordis are likely to come out today to vote, while fewer Jorges are likely to do so. Which means that today’s vote, thanks to the selection bias of one side being much more disposed to vote than the other, is likely to throw up a skewed result! Thus, it makes sense to take the results with a pinch of salt.
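A back-of-the-envelope sketch of how much the result can get skewed. The turnout figures below are purely hypothetical (I have no data on actual turnout by camp), and `observed_yes_share` is just a name I’ve made up for the arithmetic.

```python
def observed_yes_share(true_yes_share, yes_turnout, no_turnout):
    """Share of "Yes" among votes actually cast, when the two camps
    turn out at different rates."""
    yes_votes = true_yes_share * yes_turnout
    no_votes = (1 - true_yes_share) * no_turnout
    return yes_votes / (yes_votes + no_votes)

# Hypothetical: the population is split 50-50, but 70% of the Jordis
# bother to vote while only 10% of the Jorges do.
share = observed_yes_share(true_yes_share=0.5, yes_turnout=0.7, no_turnout=0.1)
print(f"Yes share among votes cast: {share:.1%}")
```

With these made-up numbers an evenly split population produces a “Yes” share of 87.5% among the votes cast. If both camps turned out at the same rate, the function would simply return the true share.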

But what about higher order effects? It is not hard for Jordis and Jorges to see what I’ve written above. Knowledge of this is not likely to change Jordi’s stance – just “victory” in today’s referendum is not enough for him. He is using today’s vote to primarily make a statement and the larger the “majority” that can be shown in favour of a “Yes” vote, the better it is for him. So the second order effects will not affect Jordi.

What about Jorge? He understands that his vote doesn’t really matter since today’s referendum is not real, but he also knows that most people in favour of secession are likely to be voting today. Thus, the referendum is going to show an inflated majority in favour of “Yes”. So should he vote today to balance things out? On the one hand the effort might be worth it in terms of reducing the majority for the “Yes” vote. But then again, Jorge will realise that the selection bias in today’s vote is very apparent, and his effort in marginally reducing the majority in favour of “Yes” may not actually be worth it! And so he will not vote.

So it is clear that today’s vote will show a significant majority in favour of secession from Spain, but that this is likely to be vastly overstated and very different from what things would be like had today’s vote been real. In that sense, if Spanish Prime Minister Mariano Rajoy and his advisers are smart, and realise the selection bias that is inherent, they can render today’s “informal referendum” rather pointless.

Why I’m inherently anti-muslim

So yes, I consider myself a secularist and all that. I have a number of friends who are from “minority communities”. I still, however, think that parties like the Congress do go out of their way in order to woo “minority communities”. And (though I’ve never voted so far) unless the BJP majorly goofs up on some other axis, I’m significantly more likely to vote for the BJP than vote for the Congress. There are times when I want to try my hand at politics, and those times I wish I had friends in the BJP through whom I could try to get a foothold. At times, however, I don’t care and become willing to join just about any party which will welcome me.

So this is the reason I think I’ve been inherently anti-Muslim. Back in kindergarten, there were two bullies in my class. Two absolute bullies, boys who were bigger than most others, who wouldn’t hesitate to be violent. When I was in second standard, one of them had scratched my leg so hard that there was a septic infection which took a long time to heal. I would see conscious efforts by these guys to be mean to others.

Back in junior school, there were three Muslims in my class. There was one quiet girl who I must admit I didn’t talk much to, but then back then I didn’t talk much to girls at all in general. And then there were two Muslim boys. And they happened to be two bullies.

So here I am, six or seven years old, and seeing a one-to-one correspondence between Muslim boys and bullies. I’m too young to know of stuff like “selection bias”, “small sample bias” and the like. Every day, on TV, I’m hearing anti-Pakistan rhetoric. And this was the period between the Shah Bano case and the Babri Masjid demolition, so most family members were also fairly anti-Muslim. And when India won against Pakistan in Sharjah for the first time (Srinath’s debut; October 1991), it was a victory not against Pakistan but against “those bloody Muslims”. When a week later (in the final league match), we lost narrowly trying to chase down 250 odd in utter darkness, it was because “those bloody Muslims had cheated us”.

You can be a rational person, but it is hard to go against biases that were created fairly early on in life. However hard you try to make rational decisions, it’s hard to go against something that’s been built into your instincts. And as I explained earlier, there was a clear one-to-one correspondence that I noticed that made me form my biases. Yes, I try to be rational and “secular”, but sometimes it takes effort to go against your instincts. So yes, I suffer from “anti-Muslim bias”.

Booze and volatility

Another of those things I’ve been intending to write for a really long time. Occasionally when I’m not feeling too good mentally, people ask me to go have a drink, telling me that everything will be alright. However, given my limited experience in this I’m not too confident it will work. In fact, the one time I tried drowning my sorrows in alcohol (this was over four years ago) I ended up feeling significantly worse, worse enough to have not tried it since.

The thing with booze is that it increases the volatility of your state of mind. This means that it will flatten out the curve according to which your mental state moves. So after you’ve had a drink or few, you are unlikely to remain in the same state that you started off in. You end up feeling either significantly better or significantly worse – and the chances of both these go up tremendously when you drink.

I know I have been so far acting based on one data point that went adversely, but I don’t know what causes the selection bias in people who have been through both sides significantly, of feeling much worse and of feeling much better after having some drinks. Why is it that even though all of them would’ve been through significantly worse after drinking at some point of time or the other, they tend to forget about it and only think of the times when they’ve felt better?

Is it that whether you feel good or not is some kind of a binary payoff depending upon the level of the state of mind (basically state of mind < cutoff implies “bad”; state of mind >= cutoff implies “good”)? If this is true, then whenever you are “out of the money” (feeling bad), you don’t really care if you go even more out of the money – your overall feeling doesn’t change by much. And so you don’t really mind the cases when the alcohol starts making you feel significantly worse. But then the barrier is ahead of you, so by increasing volatility you are giving yourself a better chance of surmounting the barrier, and drinking makes sense! But then under this condition it doesn’t make sense to drink at all when you’re already feeling good!
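The binary payoff idea can be sketched with a few lines of code. This assumes (and it is only an assumption for illustration) that your state of mind after drinking is normally distributed around where you started, with the booze only raising the standard deviation. The function name and the numbers are made up.

```python
from statistics import NormalDist

def chance_of_feeling_good(current_state, cutoff, volatility):
    """Probability that the final state of mind ends up at or above the cutoff,
    assuming it is normally distributed around the current state."""
    return 1 - NormalDist(mu=current_state, sigma=volatility).cdf(cutoff)

CUTOFF = 0.0   # hypothetical barrier between "bad" and "good"
SOBER_VOL = 1.0
DRUNK_VOL = 3.0  # hypothetical: drinking triples the volatility

# Feeling bad: starting below the cutoff ("out of the money").
bad_sober = chance_of_feeling_good(-1.0, CUTOFF, SOBER_VOL)
bad_drunk = chance_of_feeling_good(-1.0, CUTOFF, DRUNK_VOL)

# Feeling good: starting above the cutoff ("in the money").
good_sober = chance_of_feeling_good(1.0, CUTOFF, SOBER_VOL)
good_drunk = chance_of_feeling_good(1.0, CUTOFF, DRUNK_VOL)

print(f"feeling bad:  {bad_sober:.0%} sober vs {bad_drunk:.0%} after drinking")
print(f"feeling good: {good_sober:.0%} sober vs {good_drunk:.0%} after drinking")
```

Under these assumptions, more volatility helps when you start below the barrier and hurts when you start above it, which is the same logic as holding a binary option.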

Are there any other reasons you can think of for this selection bias? Why do people give more weight to positive movement in state of mind as a function of drinking than to negative movement in state of mind? Or is it that volatility is a non-intuitive concept and “there’s a better chance you’ll feel better if you drink” is a simple way of communicating it? And let me know your experience of drink making you feel worse.