More Karnataka: Averaging between ULB elections and 2008 elections

Recently I met my MLA, who is from the BJP, and told him about my analysis extrapolating from the recent urban local body (ULB) elections, which gave the Congress an absolute majority. He countered that the BJP has never been strong at the sub-state level, so one shouldn’t read too much into these elections. So I built this tool, which has a slider that lets you decide how much importance to give to the ULB polls. A value of 0 gives the seat distribution as per the 2008 elections. A value of 1 uses an extrapolation of the ULB elections only (without using information from the 2008 polls). I hope you have fun with this tool.

You might also notice that I now give you the actual seat distribution party-wise.
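For readers who want to play with the idea offline, here is a minimal sketch of one way such a slider could work. It assumes the averaging is a simple linear blend of vote shares within each constituency, which is an assumption made for illustration and not necessarily how the tool does it. The function names and the vote shares are hypothetical.

```python
def blend_vote_shares(shares_2008, shares_ulb, ulb_weight):
    """Interpolate linearly between two sets of vote shares (in percent).

    ulb_weight = 0 reproduces the 2008 shares, ulb_weight = 1 the ULB shares.
    The linear blend itself is an assumption made for this sketch.
    """
    if not 0.0 <= ulb_weight <= 1.0:
        raise ValueError("ulb_weight must lie between 0 and 1")
    parties = sorted(set(shares_2008) | set(shares_ulb))
    return {
        party: (1.0 - ulb_weight) * shares_2008.get(party, 0.0)
        + ulb_weight * shares_ulb.get(party, 0.0)
        for party in parties
    }


def winner(shares):
    """Return the party with the largest vote share."""
    return max(shares, key=shares.get)


# Made-up vote shares for one hypothetical constituency.
example_2008 = {"BJP": 42.0, "INC": 38.0, "JDS": 20.0}
example_ulb = {"BJP": 30.0, "INC": 45.0, "JDS": 25.0}

for weight in (0.0, 0.5, 1.0):
    blended = blend_vote_shares(example_2008, example_ulb, weight)
    print(weight, winner(blended), blended)
```

Repeating this for every constituency and counting the winners would give a seat distribution for each slider position.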

India State Wise Road Density

Roads are one of the strongest measures of economic activity. The denser the road network in a particular area, the easier it is for people in the area to connect with and trade with each other, thus leading to a higher degree of economic activity. The graph here compares the length of roads across Indian states in 2011.

Source: Statistical Year Book India, 2013

It would also be interesting to see how the states compare in terms of road length added between 2009 and 2011. The graph here shows the CAGR (compound annual growth rate) in the total length of roads in each state between 2009 and 2011.

Source: Statistical Year Book India, 2013
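For reference, CAGR between two years is computed from the start value, the end value and the number of years between them. The sketch below uses made-up road lengths, not figures from the Statistical Year Book.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% a year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Hypothetical state: 100,000 km of roads in 2009 and 121,000 km in 2011.
growth = cagr(100_000, 121_000, years=2)
print(f"CAGR: {growth:.1%}")
```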

Growth in Per Capita Consumption Expenditure

Measuring people’s incomes is a hard task. Survey respondents have considerable incentive to either understate or overstate their own income. Total consumption expenditure is therefore a good proxy for income.

One of the assignments for the ongoing batch of the Graduate Certificate in Public Policy Program asked students to analyze how the quality of life in India has changed over the last 50 years. Rishabh Raj responded with the graph below, which shows how per capita consumption expenditure has grown over that period. Note that the figures are adjusted for inflation. Offered without further comment:

Source: http://data.worldbank.org/country/india Numbers in 2000 rupees. 

Bangalore North, South, Central, Rural

I don’t know if you want to call this gerrymandering, but I just want to pictorially map out the areas of Bangalore that fall under different parliamentary constituencies.

White: Chickballapur

Black: Bangalore North

Red: Bangalore Central

Green: Bangalore South

Blue: Bangalore Rural

Source: http://openbangalore.org/

Karnataka – effect of swings from 2008 election

You can use the sliders below to see how changes in the vote shares of the major parties affect the seat distribution. The “base” here is the vote share in the 2008 Assembly Elections. The numbers on the sliders are in percentages.

Update
This new version uses the vote shares in various districts during the recent Urban Local Body elections to account for the BJP split. As you can see, there is a slider that lets you indicate how much of the split in the BJP’s votes seen in the recent ULB elections will carry over to the forthcoming assembly elections. I added this slider following feedback that the split of BJP votes in the local body elections may not translate directly to the assembly elections.

There is also a slider called ‘performance impact’. This is based on data from the Daksh survey, in which samples of voters were asked to evaluate their sitting MLA. When the slider is at 0, the MLA’s performance has no impact on vote share in the forthcoming elections. When it is at 5, an MLA whom voters are “extremely happy” with gets 5% more votes, and an MLA whom voters are “extremely unhappy” with gets 5% fewer votes, than last time.
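To make the ‘performance impact’ slider concrete, here is a rough sketch of the adjustment it describes. It rests on several assumptions that are not stated above: the 5% is read as percentage points of vote share, satisfaction is placed on a hypothetical scale from -1 (extremely unhappy) to +1 (extremely happy), and ratings in between scale linearly. The function name and the vote shares are made up.

```python
def apply_performance_impact(shares, incumbent_party, satisfaction, impact):
    """Shift the incumbent's vote share by impact * satisfaction points.

    satisfaction: -1.0 (extremely unhappy) to +1.0 (extremely happy);
                  this scale is an assumption made for the sketch.
    impact:       slider value, read here as percentage points.
    """
    if not -1.0 <= satisfaction <= 1.0:
        raise ValueError("satisfaction must lie between -1 and 1")
    adjusted = dict(shares)  # copy, so the base shares stay untouched
    adjusted[incumbent_party] = max(
        0.0, shares[incumbent_party] + impact * satisfaction
    )
    return adjusted


# Made-up 2008 vote shares for a hypothetical constituency held by the BJP.
base = {"BJP": 40.0, "INC": 38.0, "JDS": 22.0}

unhappy = apply_performance_impact(base, "BJP", satisfaction=-1.0, impact=5.0)
print(unhappy, "winner:", max(unhappy, key=unhappy.get))
```

With the slider at 5 and extremely unhappy voters, the made-up incumbent drops from 40 to 35 and loses the seat. With the slider at 0, nothing changes.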

Wahhabism versus Salafism

in collaboration with Narayan Ramachandran

During a meeting at the Takshashila office last week, Narayan Ramachandran, senior advisor to Takshashila, pointed out what he thought was a change over the years in the language used to describe Islamists. “In the aftermath of 9/11”, Narayan said, “the dominant word used was wahhabi. However, over the years its use has waned and has been instead replaced by salafi”. As good quants, we decided it would be best to back up this hypothesis with data before trying to understand the shift.

Some time ago, Google launched a service called Google Trends, which gives a measure of the popularity of a particular word or phrase as a search term over the years, and lets us compare various phrases. We used Google Trends to compare “Wahhabi” and “Salafi” and found this:

This graph clearly shows that by 2004 the word “Wahhabi” was already out of fashion and “Salafi” was much more widely used.

If you look at only the US, however, the situation is different. Though there was significant fluctuation, the usage of “Wahhabi” and “Salafi” was comparable until about 2006. From 2006-07 onwards, “Salafi” pulled well ahead of “Wahhabi”, and it has remained that way, in line with the worldwide trend.

It would be interesting to analyze, however, the reasons for this shift in nomenclature.

Update
Pavan Srinath weighs in that “Wahhabi” can also be spelt as “Wahabi”. And that spelling is actually on the upswing:
http://www.google.com/trends/explore?hl=en#q=wahabi%2C%20wahhabi%2C%20salafi&cmpt=q

In the US, though, both Wahhabi and Wahabi are on the decline:
http://www.google.com/trends/explore?hl=en#q=wahabi,+wahhabi,+salafi&cmpt=q&geo=US

So it appears that what is on the decline is the spelling “Wahhabi” more than the word itself.

A Victory By Default for the Congress in Karnataka

Voters like to chase winners. Everybody wants their own MLA to belong to the winning party, for they perceive that the benefits to the constituency in that case are likely to be much greater. Some people vote by ideology. Others by caste. Many, however, just try to second-guess who the winner is, and vote for them.

A recent survey conducted by Daksh asked voters across Karnataka if they had voted for the winning candidate. A whopping 75% of the voters said they had. In the 2008 Assembly Elections, the average vote share of a winning candidate was 43%. The graph here shows, constituency by constituency, the vote share of the winning party in the 2008 elections and the proportion of survey respondents in the constituency who claimed to have voted for the winner.

Source: The Daksh Survey of Karnataka MLAs and empoweringindia.org

Notice how in most constituencies, the proportion of survey respondents who have said they’ve voted for the winner is far higher than the vote share the winning candidate got in the elections. This can mean one of two things. One possibility is that the sample that Daksh used for their survey was heavily biased. However, given that the result is consistent across all constituencies, it is unlikely that they would have got a biased sample in all constituencies. The other possibility is that everybody likes to be associated with the winner.

In most elections, it is difficult to gauge who “everyone else” is voting for. People sometimes rely on opinion or exit polls, but these are usually based on small samples. This election is different. It was immediately preceded by elections to 200-odd urban local bodies all over Karnataka. All towns and cities in Karnataka went to the polls only a month or two ago. While there are some cities where the local bodies have been hung, if you look across the state, the verdict is clear. And people are aware of that verdict.

Given the Congress’s strong performance in the recent urban local body elections, the marginal voter is likely to swing to the Congress. Media reports indicate that all is not well with the party and there is considerable bickering for tickets. However, the weight of the marginal votes is going to be enough to push the Congress comfortably over the line. The bickering for tickets can be looked at in another way – Congressmen know that their party is going to form the government, and are extra keen on riding a winning horse.

Journal of Bad Statistics: Road Accidents in India

Occasionally, this blog takes a break from presenting interesting data to critique data-related journalism in the media. Our object of attention for this post is a report in the Hindustan Times that states that “Maharashtra has highest number of road accidents in the country”. The headline is factually correct, if you go by the data on the website of the Ministry of Road Transport and Highways. The problem, however, is that it is a meaningless statistic.

It might be intuitive to you that one cannot compare the number of accidents in a large state like Maharashtra to that in a small state like Manipur: the former is so much larger than the latter that it is bound to have more accidents. Extending this argument, does it make sense to compare states on the basis of the sheer number of accidents? Does the statistic of “state with highest number of accidents” make sense? If not, what is a good metric to compare road safety in various states?

Comparing values measured in ‘absolute numbers’ across geographies makes no sense, for it doesn’t take into account the difference in size of the various geographies. To get a good comparison we need to “normalize” the measure so that it takes into account the relative sizes of the geographies. And it is important that we use the right metric to normalize by.

So how do we compare the accident rates in Maharashtra and Manipur, given their different sizes? An intuitive normalizing factor is the state population. Population might be a good metric for comparing birth rates or disease incidence rates, but road accidents? Population doesn’t account for people in one state driving more than in another state. We need a better metric.

Going back to basics, what are we trying to achieve by comparing accident rates across states? The accident rate is probably going to be used as a proxy for road safety. So how would you compare road safety across two different regions? A good metric, I would argue, is the likelihood of having an accident if you were to drive 1 kilometer, that is, the number of accidents per vehicle kilometer. Notice that this at once takes care of both problems discussed above: the sizes of states and the propensity of people in various states to drive.

However, whether this is the best metric is debatable. For example, this metric ignores the “vehicle mix” in various states – so would “passenger kilometer” (rather than “vehicle kilometer”) be better? Perhaps. Again, this metric assumes that all kinds of roads are similar, and treats traveling along a kilometer of a highway as equivalent to traveling a kilometer on a village road. There are no “perfect” metrics or “normalizing factors” – so we have to choose one that is “good enough” and go with it.

Now, let us compare states based on their likelihood of accidents. Unfortunately, data on “vehicle kilometers” is hard to come by; in the absence of tolled roads, no one really keeps track of this. So we need to use a proxy. Again, what the best proxy is remains debatable (remember, there was already a debate on what the best measure is), but for ease of data capture (if nothing else) let us use “accidents per total road length” as the metric here. The drawback of this metric is that it doesn’t capture how busy the roads are, so it is only a loose proxy for how much people drive.
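As a quick illustration of why the normalization matters, here is a small sketch of the “accidents per 10000 kilometers of road” calculation. The two regions and all their figures are invented for the example and are not taken from the Ministry’s data.

```python
def accidents_per_10000_km(accidents, road_length_km):
    """Accidents per 10,000 km of road, the proxy metric described above."""
    if road_length_km <= 0:
        raise ValueError("road_length_km must be positive")
    return accidents * 10_000 / road_length_km


# Made-up figures for two hypothetical regions.
regions = {
    "Large state": {"accidents": 60_000, "road_km": 300_000},
    "Small territory": {"accidents": 600, "road_km": 1_000},
}

rates = {
    name: accidents_per_10000_km(data["accidents"], data["road_km"])
    for name, data in regions.items()
}

most_accidents = max(regions, key=lambda name: regions[name]["accidents"])
highest_rate = max(rates, key=rates.get)

print("Most accidents in absolute terms:", most_accidents)
print("Most accidents per 10,000 km of road:", highest_rate)
```

In this toy example the large state tops the list on absolute numbers, while the small territory has three times its accident rate per kilometer of road.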

The graph below shows the relative safety of roads in Indian states. Based on accidents per 10000 kilometers of roads, we see that Maharashtra (green) is quite close to the national average (blue). It turns out that it is the union territory of Lakshadweep that is the clear outlier on number of accidents per kilometer of road.

Road accidents per 10000 Km of roads, per state (2008). Source: Ministry of Road Transport and Highways

Based on this, we can say that the article in the Hindustan Times quoted at the beginning of this piece, while factually correct, does not present a correct picture.

Importance of candidate’s caste in voting

Not-for-profit Daksh recently conducted a massive survey in Karnataka to understand voter preferences, evaluate MLA performance, and so on. This comprehensive survey covered over 12000 voters across all districts in Karnataka. Apart from capturing demographic information, the survey asked what voters look for in a candidate and what issues they think are important for an MLA.

One of the questions asked how important a candidate’s caste is when it comes to voting. Voters were asked to indicate if it was “very important”, “important” or “not important”. For the purpose of my analysis I’ve given a score of 1 for “very important”, 0.5 for “important” and 0 for “not important”. The relationship between a voter’s annual family income and their perception of the importance of caste is extremely interesting, as this graph indicates.

Data Source: Daksh Survey on Perceptions about Karnataka MLAs. Thanks to Harish Narasappa of Daksh for sharing data with me
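The scoring described above can be sketched in a few lines. The scores (1, 0.5, 0) are the ones used in the analysis; the income brackets, the record layout and the sample responses below are hypothetical, and averaging the scores within each bracket is assumed to be how the graph was built.

```python
from collections import defaultdict

# Scores as described in the post.
CASTE_SCORES = {"very important": 1.0, "important": 0.5, "not important": 0.0}


def mean_caste_score(responses):
    """Average importance score for a list of survey responses."""
    if not responses:
        raise ValueError("need at least one response")
    return sum(CASTE_SCORES[r] for r in responses) / len(responses)


def score_by_income(records):
    """Average score per income bracket.

    records: iterable of (income_bracket, response) pairs. This layout is
    an assumption and not the format of the Daksh data.
    """
    grouped = defaultdict(list)
    for bracket, response in records:
        grouped[bracket].append(response)
    return {bracket: mean_caste_score(rs) for bracket, rs in grouped.items()}


# Made-up sample records, for illustration only.
sample = [
    ("low", "very important"),
    ("low", "important"),
    ("high", "not important"),
]
print(score_by_income(sample))
```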

Indians are killing themselves, while farmers are dying in Vidarbha

Palagummi Sainath of The Hindu has, over the years, brought attention to the problem of farmer suicides in India, particularly in Vidarbha. While it is true that in 2011 over 14000 farmers killed themselves in Maharashtra (data is not available at a finer level), this focus on farmer suicides masks the larger problem of suicide in India.

As you can see from the following graph, the suicide rate among non-farmers is an order of magnitude greater than the suicide rate among farmers.

Suicide data from http://ncrb.nic.in/CD-ADSI2011/table-2.11.pdf . Population data from Census 2011. Since occupation-wise population data is not yet available, we have assumed that 60% of the population of each state is engaged in agriculture.
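For anyone who wants to redo the comparison, here is a minimal sketch of the rate calculation under the 60% assumption stated in the note above. Expressing the rates per 100,000 people is a choice made for this sketch, and the population and suicide counts in the example are made up.

```python
ASSUMED_FARMER_FRACTION = 0.6  # assumption from the source note above


def suicide_rates_per_100k(population, farmer_suicides, total_suicides,
                           farmer_fraction=ASSUMED_FARMER_FRACTION):
    """Return (farmer_rate, non_farmer_rate), each per 100,000 people."""
    if population <= 0 or not 0.0 < farmer_fraction < 1.0:
        raise ValueError("need a positive population and 0 < fraction < 1")
    if not 0 <= farmer_suicides <= total_suicides:
        raise ValueError("farmer_suicides must lie between 0 and the total")
    farmers = population * farmer_fraction
    non_farmers = population - farmers
    farmer_rate = farmer_suicides / farmers * 100_000
    non_farmer_rate = (total_suicides - farmer_suicides) / non_farmers * 100_000
    return farmer_rate, non_farmer_rate


# Made-up state: 1,000,000 people, 200 suicides, 60 of them farmers.
farmer_rate, non_farmer_rate = suicide_rates_per_100k(1_000_000, 60, 200)
print(f"farmers: {farmer_rate:.1f}, non-farmers: {non_farmer_rate:.1f}")
```

The result is sensitive to the assumed farmer fraction, so it is worth rerunning with other values of `farmer_fraction` once occupation-wise population data becomes available.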