Risk and data

A while back, a large group of scientists wrote an open letter to the Prime Minister demanding greater data sharing with them. I must say that the letter is written in academic language, and the effort to understand it was too much for me, but in the interest of fairness I’ll put a screenshot that was posted on Twitter here.

I don’t know about this clinical and academic data. However, the holding back of one kind of data, in my opinion, has massively (and negatively) impacted people’s mental health and risk calculations.

This is data on mortality and risk. The kinds of questions that I expected government data to answer were:

  1. If I get covid-19 (now in the second wave), what is the likelihood that I will die?
  2. If my oxygen level drops to 90 (>= 94 is “normal”), what is the likelihood that I will die?
  3. If I go to hospital, what is the likelihood that I will die?
  4. If I go to the ICU, what is the likelihood that I will die?
  5. What is the likelihood of a teenager who contracts the virus (and is otherwise in good health) dying of the virus?

And so on. Simple risk-based questions whose answers can help people calibrate their lives and take calculated enough risks to get on with it without putting themselves and their loved ones at risk.

Instead, what we find from official sources is nothing but aggregates. Total numbers of people infected, dead, recovered and so on. And it is impossible to infer answers to the “risk questions” based on that.

And who fills in the gaps? The media, of course.

I must have discussed “spectacularness bias” on this blog several times before. Basically the idea is that for something to be news, it needs to carry information. And an event carries information if it occurs despite having a low prior probability (or does not occur despite having a high prior probability). As I put it in my lectures, “‘dog bites man’ is not news. ‘man bites dog’ is news”.
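To make “carries information” a little more concrete, here is a minimal sketch using the standard information-theory measure of surprise, the negative log of the prior probability. The two probabilities are made up purely for illustration; nobody has measured how often men bite dogs.

```python
import math


def surprisal_bits(p: float) -> float:
    """Information content, in bits, of an event with prior probability p."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


# Made-up priors, for illustration only.
p_dog_bites_man = 0.5
p_man_bites_dog = 1 / 1024

print("dog bites man:", surprisal_bits(p_dog_bites_man), "bits")
print("man bites dog:", surprisal_bits(p_man_bites_dog), "bits")
```

The rarer the event, the more bits it carries, which is exactly why it is the rare event that gets reported.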

So when we rely on media reports to fill the gaps in our risk systems, we end up taking all the wrong kinds of lessons. We learn that one seventeen-year-old boy died of covid despite being otherwise healthy. In the absence of other information, we assume that teenagers are at grave risk from the disease.

Similarly, cases of children looking for ICU beds get forwarded far more than cases of old people looking for ICU beds. In the absence of risk information, we assume that the situation must be grave among children.

Old people dying from covid goes unreported (unless the person was famous in some way or the other), since the information content in that is low. Young people dying gets amplified.

Based on all the reports that we see in the papers and other media (including social media), we get an entirely warped sense of what the risk profile of the disease is. And panic. When we panic, our health gets worse.

Oh, and I haven’t even spoken about bad risk reporting in the media. I saw a report in the Times of India this morning (unable to find a link to it) that said that “young are facing higher mortality in this wave”. Basically the story said that people under 60 account for a far higher proportion of deaths in the second wave than in the first.

Now there are two problems with that story.

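  1. A large proportion of over 60s in India are vaccinated, so mortality is likely to be lower in this cohort. That alone pushes up the share of deaths accounted for by under 60s, even if nothing has changed for them.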
  2. What we need is the likelihood of a person under 60 dying upon contracting covid. NOT the proportion of deaths accounted for by under 60s. This is the classic “averaging along the wrong axis” that they unleash upon you in the first test of any statistics course.
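Here is a rough sketch of how the two problems play out together. Every number in it is invented for illustration and is NOT real data; the group names and the two helper functions are my own. In this toy setup, the only thing that changes between the waves is that fewer over 60s die.

```python
# All numbers below are invented for illustration; they are NOT real data.
waves = {
    "first wave": {
        "under 60": {"cases": 80_000, "deaths": 400},
        "over 60": {"cases": 20_000, "deaths": 1_600},
    },
    "second wave": {
        "under 60": {"cases": 80_000, "deaths": 400},
        "over 60": {"cases": 20_000, "deaths": 400},  # fewer deaths, e.g. due to vaccination
    },
}


def share_of_deaths(wave: dict, group: str) -> float:
    """Proportion of all deaths in the wave that fall in this group."""
    total_deaths = sum(g["deaths"] for g in wave.values())
    return wave[group]["deaths"] / total_deaths


def fatality_rate(wave: dict, group: str) -> float:
    """Likelihood of dying for a person in this group who tested positive."""
    return wave[group]["deaths"] / wave[group]["cases"]


for name, wave in waves.items():
    share = share_of_deaths(wave, "under 60")
    rate = fatality_rate(wave, "under 60")
    print(f"{name}: under-60 share of deaths {share:.0%}, "
          f"under-60 fatality rate {rate:.1%}")
```

In this made-up example the under-60 share of deaths jumps from 20% to 50%, while the likelihood of an under-60 patient dying stays at 0.5% in both waves. The headline number moved; the risk did not.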

Anyway, so what kind of data would have helped?

  1. Age profile of people testing positive, preferably state wise (any finer will be noise)
  2. Age profile of people dying of covid-19, again state wise
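With just these two tables, anyone could work out a crude age-wise risk number. Below is a minimal sketch, assuming each table maps a (state, age band) pair to a count. The state name, the age bands and all the counts are hypothetical, and the result is only the fatality rate among people who tested positive, so it ignores untested infections and the lag between testing positive and dying.

```python
def fatality_by_age(cases: dict, deaths: dict) -> dict:
    """Deaths divided by positive tests for each (state, age band) key."""
    rates = {}
    for key, n_cases in cases.items():
        if n_cases > 0:  # skip empty cells rather than divide by zero
            rates[key] = deaths.get(key, 0) / n_cases
    return rates


# Hypothetical counts, NOT real data.
cases = {
    ("State A", "0-19"): 10_000,
    ("State A", "20-59"): 50_000,
    ("State A", "60+"): 10_000,
}
deaths = {
    ("State A", "20-59"): 250,
    ("State A", "60+"): 500,
}

for (state, age_band), rate in fatality_by_age(cases, deaths).items():
    print(f"{state}, {age_band}: {rate:.2%} of positive cases died")
```

Even a table this crude would answer the teenager question far better than a forwarded news report.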

I’m sure the government collects this data. It’s just that they’re not used to releasing this kind of data, so we’re not getting it. And so we have to rely on the media and its spectacularness bias to get our information. And so we panic.

PS: By no means am I stating that covid-19 is not a risk. All I am stating is that the information we have been given doesn’t help us make good risk decisions.