I call this a lazy post since I didn’t originally write it as a blog post. I had written it as an email to a mailing list, and later thought it might make sense as a blog post. For context: a prominent and well-respected member of the group had written a fairly lengthy argument, and ended it by saying “Maybe this calls for a good regression analysis….” My reply is below.
I need to mention here that this mail to the group wasn’t responded to (apart from one tangential remark by Udupa). I don’t know if it simply got lost in the flood of mails on the list today, or if people on the group (in general, a very intelligent lot) don’t care for this kind of stuff, or if, for some reason, this caused discomfort of some sort. Anyway, I begin:
I think I had raised this point before in a similar context. It is about the use and misuse of statistical analysis. I think one lesson that ought to be learnt from the ongoing financial crisis and the events leading up to it is that statistical analysis, when misused, can have dangerous consequences, and not just for the people who are misusing it.
There is this popular view that if there is data, then one ought to do statistical analysis, draw conclusions from that, and make decisions based on these conclusions. Unfortunately, in a large number of cases, the analysis ends up being done by someone who is not very proficient with statistics and who is basically applying formulae rather than using a concept. As long as you are using statistics as concepts, and not as formulae, I think you are fine. But once you get into the “ok i see a time series here. let me put regression. never mind the significance levels or stationarity or any other such blah blah but i’ll take decisions based on my regression” mode, you are likely to get into trouble.
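To make the stationarity point concrete, here is a small simulation sketch. It is my own illustration and not part of the original email. It assumes two completely independent random walks, fits a plain textbook least-squares line of one on the other, and counts how often the slope looks "significant" by the rough |t| > 2 rule of thumb. It then repeats the exercise on the first differences of the same series, which are stationary. The function names, the seed, the sample size and the number of trials are all arbitrary choices made for the illustration.

```python
import math
import random


def slope_t_stat(x, y):
    """t-statistic of the slope in the simple OLS fit y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    # Residual sum of squares, then the usual standard error of the slope.
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return b / se


def random_walk(rng, n):
    """Cumulative sum of independent standard normal steps."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def diff(series):
    """First differences of a series (one element shorter)."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def rejection_rates(trials=200, n=100, seed=0):
    """Share of trials with |t| > 2, on levels and on first differences."""
    rng = random.Random(seed)
    hits_levels = hits_diffs = 0
    for _ in range(trials):
        # x and y are generated independently: there is no true relationship.
        x = random_walk(rng, n)
        y = random_walk(rng, n)
        if abs(slope_t_stat(x, y)) > 2.0:
            hits_levels += 1
        if abs(slope_t_stat(diff(x), diff(y))) > 2.0:
            hits_diffs += 1
    return hits_levels / trials, hits_diffs / trials


levels_rate, diffs_rate = rejection_rates()
print(f"'significant' slope on levels:      {levels_rate:.0%}")
print(f"'significant' slope on differences: {diffs_rate:.0%}")
```

Since the two series are unrelated by construction, any "significant" slope here is an artefact. The regression on the raw levels should flag far more of them than the regression on the differences, which is the kind of trouble you walk into when you "put regression" on a time series without checking stationarity.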
I think this is broadly the kind of point made by people like Paul Wilmott: that the problem is not with statistical analysis, but with the way people use it.
OK, now that I’m done with my rant: I’m very sceptical about regression yielding any kind of conclusive results here. I think the number of data points we have is too small to produce any meaningful results. Of course, I’m saying this without really looking at all the data that you might want to include. And I won’t be surprised if a few tens of papers get published on this topic, all based on statistical analyses, and with results all orthogonal to one another.