Standard Error in Survey Statistics

Over the last week or more, one of the topics of discussion in the pink papers has been the employment statistics recently published by the NSSO. Mint, which first carried the story, has now started a whole series on it, titled “The Great Jobs Debate”, in which people from both sides of the fence have been using the paper to argue why the data does or doesn’t make sense.

The story started when Mint editor and columnist Anil Padmanabhan (who, along with Aditya Sinha (now at DNA) and Aditi Phadnis (of Business Standard), ranks among my favourite political commentators in India) pointed out that the number of jobs created during the first UPA government (2004-09) was about 1 million, far fewer than the number created under the preceding NDA government (about 60 million). This has led to a hue and cry from all sections. Leftists say that jobless growth is the result of too many reforms, rightists say we aren’t creating jobs because we haven’t had enough reform, and some others say there’s something wrong with the data. Chief Statistician TCA Anant, in his column published in the paper, tried to use some obscurities in the sub-levels of the survey to point out why the data makes sense.

In today’s column, Niranjan Rajadhyaksha points out that the way employment is counted in India is very different from the way it is counted in developed countries. In the latter, employers periodically report their payroll statistics to the statistics collection agency. However, because of the large unorganized sector, this is not possible in India, so we resort to “surveys”, for which the NSSO is the primary organization.

In a survey, to estimate a quantity across a large population, we take a much smaller sample, one small enough for us to measure the quantity rigorously. Then we extrapolate the results to the whole population. The key thing in a survey is the “standard error”, which is a measure of how far the “observed statistic” is likely to be from the “true statistic”. What intrigues me is that there is absolutely no mention of the standard error in any of the communication about this NSSO survey (again, I’m relying on the papers here; I haven’t seen the primary data).
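To make the idea of a standard error concrete, here is a minimal sketch for the simplest case: estimating a proportion (say, the share of people employed) from a simple random sample. The survey numbers and the function name are made up for illustration; nothing here comes from the NSSO data, and a real survey design would need a more involved formula.

```python
import math


def standard_error_of_proportion(successes: int, sample_size: int) -> float:
    """Standard error of a sample proportion, assuming simple random sampling."""
    p_hat = successes / sample_size
    return math.sqrt(p_hat * (1 - p_hat) / sample_size)


# Hypothetical survey (not NSSO data): 400 of 1,000 respondents are employed.
employed, surveyed = 400, 1000
p_hat = employed / surveyed
se = standard_error_of_proportion(employed, surveyed)

print(f"observed statistic = {p_hat:.3f}, standard error = {se:.4f}")
```

The observed statistic is the 40% we measured; the standard error tells us roughly how much that 40% would move around if we repeated the survey with a fresh sample.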

Typically, when we measure something by means of a survey, the result is expressed as a “95% confidence interval” for the “true value”. What we say is “with 95% confidence, the true value of XXXX lies between Xmin and Xmax”. An alternative representation is “we think the value of XXXX is centred at Xmid with a standard error of Xse”. Either way, communicating a number computed from a survey requires giving out two numbers. So what is the NSSO doing by reporting just one number (most likely the mid)?
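The two representations above carry the same information, and converting between them is simple arithmetic. The sketch below assumes the estimate is approximately normally distributed, so that a 95% interval is about 1.96 standard errors on either side of the mid; the values of `x_mid` and `x_se` are hypothetical, named after the Xmid and Xse in the text.

```python
from typing import Tuple

Z_95 = 1.96  # normal-approximation multiplier for a 95% interval


def interval_from_mid(mid: float, se: float, z: float = Z_95) -> Tuple[float, float]:
    """Turn (Xmid, Xse) into (Xmin, Xmax)."""
    return mid - z * se, mid + z * se


def mid_from_interval(low: float, high: float, z: float = Z_95) -> Tuple[float, float]:
    """Turn (Xmin, Xmax) back into (Xmid, Xse)."""
    return (low + high) / 2, (high - low) / (2 * z)


# Hypothetical estimate: 40% employed, with a standard error of 1.5 points.
x_mid, x_se = 0.40, 0.015
x_min, x_max = interval_from_mid(x_mid, x_se)
print(f"95% interval: {x_min:.4f} to {x_max:.4f}")

# Going the other way recovers the original two numbers.
recovered_mid, recovered_se = mid_from_interval(x_min, x_max)
print(f"mid = {recovered_mid:.3f}, standard error = {recovered_se:.3f}")
```

Whichever form is used, one number alone (the mid) cannot be converted into either representation.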

Samples used by NSSO are usually very small. At least, they are very small compared to the overall population, which would make the standard error very large. Could it be that the standard error is not reported because it’s so large that the mean doesn’t make sense? And if the standard error is so large, why should we even use this data as a basis to formulate policy?
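How large the standard error actually is depends on the sample size, and a rough sketch shows the relationship. This assumes simple random sampling of a proportion, which is a simplification and not a description of the NSSO’s design; the proportion, the sample sizes and the population of one billion are all hypothetical. In this simplified setting the standard error shrinks with the square root of the sample size, and the population size enters only through the finite population correction.

```python
import math
from typing import Optional


def se_proportion(p: float, n: int, population: Optional[int] = None) -> float:
    """Standard error of a proportion under simple random sampling.

    If `population` is given, apply the finite population correction.
    """
    se = math.sqrt(p * (1 - p) / n)
    if population is not None:
        se *= math.sqrt((population - n) / (population - 1))
    return se


p = 0.4                      # hypothetical true proportion
population = 1_000_000_000   # hypothetical population size

for n in (100, 1_000, 10_000, 100_000):
    plain = se_proportion(p, n)
    corrected = se_proportion(p, n, population)
    print(f"n = {n:>7}: se = {plain:.5f}, with correction = {corrected:.5f}")
```

Without the reported sample size and design, there is no way to tell from the outside where on this curve the survey sits, which is exactly why the standard error needs to be published.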

So here’s my verdict: the “estimated mean” of the employment as of 2009 is not very different from the “estimated mean” of the employment as of 2004. However, given that the sample sizes are small, the standard error will be large. So it is very possible that the true mean of employment as of 2009 is actually much higher than the true mean of 2004 (by the same argument, it could be the other way round, which points at something more grave). So I conclude that given the data we have here (assuming standard errors aren’t available), we have insufficient data to conclude anything about the job creation during the UPA1 government, and its policy implications.
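The reasoning in this verdict can be sketched with numbers. Everything below is hypothetical: the two estimates (in millions employed) and their standard errors are invented to illustrate the argument and are not NSSO figures. The sketch also assumes the two surveys are independent and that the normal approximation holds, so the standard error of the difference is the square root of the sum of the squared standard errors.

```python
import math
from typing import Tuple


def difference_with_interval(
    est_a: float, se_a: float, est_b: float, se_b: float, z: float = 1.96
) -> Tuple[float, float, Tuple[float, float]]:
    """Difference (b - a) of two independent estimates, its standard error,
    and an approximate 95% interval for the difference."""
    diff = est_b - est_a
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, se_diff, (diff - z * se_diff, diff + z * se_diff)


# Hypothetical estimates, in millions employed (NOT actual survey figures).
est_2004, se_2004 = 400.0, 5.0
est_2009, se_2009 = 401.0, 5.0

diff, se_diff, (low, high) = difference_with_interval(
    est_2004, se_2004, est_2009, se_2009
)

# If the interval contains zero, the data cannot tell growth from decline.
conclusive = low > 0 or high < 0

print(f"estimated jobs added: {diff:.1f} million (standard error {se_diff:.2f})")
print(f"95% interval: {low:.2f} to {high:.2f} million; conclusive: {conclusive}")
```

With standard errors this size, a measured gain of 1 million is consistent with anything from a sizeable loss to a sizeable gain, which is the sense in which the data would be insufficient to conclude anything.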

Orators and Writers

Yesterday I was reading an op-ed in Mint when it struck me that this particular columnist never argues, in the sense that he never constructs an argument using inductive or deductive logic. His method of argument is to say the same thing over and over again, in different ways and using different metaphors. He hopes to make his point by way of reinforcement, and considering his popularity and his ubiquity across the media, I’m sure it works for a lot of people (though not for me).

Then I started thinking about people who are known to be “great orators”, mostly from the Indian political space. I started thinking about Vajpayee, about Chandrashekhar and several other similar people. I discovered the same thing about them: they seldom construct an argument using deductive or inductive logic. Their way of getting the point across is the same as the Mint columnist’s, which is to say the same thing forcefully and in several different ways.

And thinking about it, it seems quite logical. When you are addressing a large audience, you will need to take everyone along. You will need to ensure that everyone is clued in on what you are speaking on. And when you speak, there is no way for the listener to take a step or two back if he/she misses something you said. Unlike text, the speech has to be interpreted in one parse. So if you are to be a great orator, you need to make sure that you take the audience along; that you construct your speech in such a way that even if someone gets distracted for a few words they can join back and appreciate the rest of the speech. Hence you are better off indulging in rhetoric rather than argument.

A writer, on the other hand, has no such compulsions. It is easy for his reader to go back and forth and parse the essay in whatever order he deems fit. As long as he keeps the language simple, the reader is likely to go along with him. On the other hand, if the writer indulges in rhetoric, the reader is likely to get bored and that could be counterproductive. Hence, writers are more into argument than into rhetoric.

Which brings me back to the Mint columnist I was reading yesterday who, as far as I know, has been a prolific writer but not as much of an orator (or maybe he is, but I wouldn’t know since he lives abroad). And I’m puzzled that he has settled on a rhetorical style rather than an argumentative style. I happen to have met him, and even then he was mostly using rhetoric rather than reasoning in his arguments.

So yeah, the essence is that there are two ways in which you can construct arguments: by logical reasoning, which is mostly preferred by writers, and by rhetoric, which is preferred by orators. I’m not sure how successful you can be if you interchange styles.