Mata Amrita Goes To New York Times

Remember that I had written recently that the pandemic is likely to change the practice of hugging, and the Mata Amrita Index? Now the New York Times has also covered it (possibly paywalled). It includes helpful graphics on “how to hug and how not to hug”.

It is an interesting article, quoting an expert on aerosols about what is the best way to hug. From what I gather, the key is to keep your faces turned away from each other. As long as you maintain this, hugging should still be fine.

[…] the safest thing is to avoid hugs. But if you need a hug, take precautions. Wear a mask. Hug outdoors. Try to avoid touching the other person’s body or clothes with your face and your mask. Don’t hug someone who is coughing or has other symptoms.

And remember that some hugs are riskier than others. Point your faces in opposite directions — the position of your face matters most. Don’t talk or cough while you’re hugging. And do it quickly. Approach each other and briefly embrace. When you are done, don’t linger. Back away quickly so you don’t breathe into each other’s faces. Wash your hands afterward.

Most of this seems fine. Only the last bit seems a bit difficult to implement – how do you wash your hands soon after hugging someone without offending them? I mean – I face this problem already. There are many people I come across whose handshakes (this is all pre-pandemic) leave me queasy and ill at ease until I have washed my hands. The challenge in this situation is how to efficiently wash your hands without making it explicit that the handshake wasn’t a pleasant one.

My favourite bit in the article, however, is the last one. It pertains to the “quality of hugs” that I’ve been talking about for a while now, and also happens to bring Marie Kondo into the picture.

Dr. Marr noted that because the risk of a quick hug with precautions is very low but not zero, people should choose their hugs wisely.

“I would hug close friends, but I would skip more casual hugs,” Dr. Marr said. “I would take the Marie Kondo approach — the hug has to spark joy.”

Covid-19 superspreaders in Karnataka

Through a combination of luck and competence, my home state of Karnataka has handled the Covid-19 crisis rather well. While the total number of cases detected in the state edged past 2000 recently, the number of locally transmitted cases detected each day has hovered in the 20-25 range.

Perhaps the low case volume means that Karnataka is able to give out data at a level that few other states in India are providing. For each case, the rationale behind why the patient was tested (which is usually the source where they caught the disease) is given. This data comes out in two daily updates through the @dhfwka twitter handle.

There was this research that came out recently that showed that the spread of covid-19 follows a classic power law, with a low value of “alpha”. Basically, most infected people don’t infect anyone else. But there are a handful of infected people who infect lots of others.

The Karnataka data, put out by @dhfwka and meticulously collected and organised by the folks at covid19india.org (they frequently drive me mad by suddenly changing the API or moving data into a new file, but overall they’ve been doing stellar work), has sufficient information to see if this sort of power law holds.

For every patient who was tested thanks to being a contact of an already infected patient, the “notes” field of the data contains the latter patient’s ID. This way, we are able to build a sort of graph on who got the disease from whom (some people got the disease “from a containment zone”, or out of state, and they are all ignored in this analysis).

From this graph, we can approximate how many people each infected person transmitted the infection to. Here are the “top” people in Karnataka who transmitted the disease to most people.

Patient 653, a 34 year-old male from Karnataka, who got infected from patient 420, passed on the disease to 45 others. Patient 419 passed it on to 34 others. And so on.
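The counting step described above can be sketched roughly as follows. This is only an illustration: the record layout and the note wording ("Contact of P653") are assumptions, the real covid19india.org fields may look different, and the records below are made up.

```python
import re
from collections import Counter

# Hypothetical records as (patient_id, notes). These are NOT real data.
records = [
    (1001, "Contact of P653"),
    (1002, "Contact of P653"),
    (1003, "Contact of P419"),
    (1004, "From containment zone"),        # no identifiable source: ignored
    (1005, "Travelled from out of state"),  # no identifiable source: ignored
]

# Assumed pattern for a source patient ID inside the notes text.
SOURCE_ID = re.compile(r"\bP(\d+)\b")

def count_transmissions(records):
    """Count, for each source patient, how many cases name them in the notes."""
    counts = Counter()
    for _patient_id, notes in records:
        match = SOURCE_ID.search(notes)
        if match:  # skip cases with no named source patient
            counts[int(match.group(1))] += 1
    return counts

counts = count_transmissions(records)
top_spreaders = counts.most_common()  # sorted from most to fewest transmissions
```

Sorting the resulting counts in descending order gives the kind of "top spreaders" list mentioned above.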

Overall in Karnataka, based on the data from covid19india.org as of tonight, there have been 732 cases where the source (person) of infection has been clearly identified. These 732 cases have been transmitted by 205 people. Just two of the 205 (less than 1%) are responsible for 79 people (11% of all cases where the transmitter has been identified) getting infected.

The top 10 “spreaders” in Karnataka are responsible for infecting 260 people, or 36% of all cases where transmission is known. The top 20 spreaders in the state (10% of all spreaders) are responsible for 48% of all cases. The top 41 spreaders (20% of all spreaders) are responsible for 61% of all transmitted cases.
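For anyone who wants to check the percentages, here is the arithmetic as a small sketch. The counts are the ones quoted in this post; the helper name `share` is just something made up for the illustration.

```python
def share(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

TOTAL_TRACED_CASES = 732  # cases with an identified source (from the post)
TOTAL_SPREADERS = 205     # distinct people who passed the infection on (from the post)

top2_spreader_share = share(2, TOTAL_SPREADERS)      # just under 1% of spreaders
top2_case_share = share(79, TOTAL_TRACED_CASES)      # roughly 11% of traced cases
top10_case_share = share(260, TOTAL_TRACED_CASES)    # roughly 36% of traced cases
top41_spreader_share = share(41, TOTAL_SPREADERS)    # 20% of spreaders
```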

Now you might think this is not as steep as the “well-known” Pareto distribution (80-20 distribution), except that here we are only considering 20% of all “spreaders”. Our analysis ignores the 1000 odd people who were found to have the disease at least one week ago, and none of whose contacts have been found to have the disease.

I admit this graph is a little difficult to understand, but basically I’ve ordered people found to have covid-19 in Karnataka by the number of people they’ve passed on the infection to, and graphed how many people cumulatively they’ve infected. It is a very clear Pareto curve.

The exact exponent of the power law depends on what you take as the denominator (number of people who could have infected others, having themselves been infected), but the shape of the curve is not in question.
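A rough sketch of how such a cumulative curve can be computed, and of how the choice of denominator changes the picture. The counts are illustrative only: 45 and 34 are the two figures quoted earlier in this post, the rest are made up, and the function name is hypothetical.

```python
from itertools import accumulate

def cumulative_share(counts):
    """Sort people by cases caused (descending) and return the running
    fraction of all traced cases they account for."""
    ordered = sorted(counts, reverse=True)
    total = sum(ordered)
    return [running / total for running in accumulate(ordered)]

# Illustrative counts: five spreaders, then five infected people who
# passed the disease on to nobody.
spreaders = [45, 34, 10, 6, 5]
with_non_spreaders = spreaders + [0] * 5

curve = cumulative_share(spreaders)               # denominator: spreaders only
curve_all = cumulative_share(with_non_spreaders)  # denominator: all infected

# Top 20% of spreaders is 1 person of 5; top 20% of all infected is 2 of 10.
top20_of_spreaders = curve[0]
top20_of_all = curve_all[1]
```

In this toy example the top 20% account for 45% of cases when only spreaders are counted, and 79% when the non-spreaders are included in the denominator. The curve gets steeper, but its shape stays the same.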

Essentially the Karnataka data validates some research that’s recently come out – most of the disease spread stems from a handful of super spreaders. A very large proportion of people who are infected don’t pass it on to any of their contacts.

Meetings from home

For the last eight years, I’ve worked from home with occasional travel to clients’ offices. How occasional this travel has been has mostly depended on how far away the client is, and how insistent they are on seeing my face. Nevertheless, I’ve always made it a point to visit them for any important meetings, and do them in person.

Now, with the Covid-19 crisis, this hybrid model has broken down. Like most other people in the world, I work entirely from home nowadays, even for important meetings.

On the face of it, this seems like a good thing – for example, nowadays, however important a meeting is, the transaction cost is low. An hour long meeting means spending an hour for it (the time taken for prep is separate and hasn’t changed), and there’s no elaborate song-and-dance about it with travel and dressing up and all that.

While this seems a far more efficient use of my time, I’m not sure I’m so happy about it. Essentially, I miss the sense of occasion. Now, an important meeting feels no different from an internal meeting with partners, or some trivial update.

Travel to and from an important meeting was a good time to mentally prepare for it, and then take stock of how it had gone. Now, until ten minutes before a meeting, I’m living my life as usual, and the natural boundaries that used to help me prep are also gone.

The other problem with remotely being there in large but important meetings is that it’s really easy to switch off. If you’re not the one who is doing a majority of the talking (or even the listening), it becomes incredibly hard to focus, and incredibly easy to get distracted elsewhere in the computer (it helps if your camera is switched off).

In a “real” physical meeting, however large the gathering is, it is naturally easy for you to focus (and naturally more difficult to be distracted), and also easier to get involved in the meeting. An online meeting sometimes feels a bit too much like a group discussion, and without visual cues involved, it becomes really hard to butt in and make a point.

So once we are allowed to travel, and to meet, I’m pretty certain that I’ll start travelling a bit for work again. I’ll start with meetings in Bangalore (inter-city travel is likely to be painful for a very long time).

It might involve transaction cost, but a lot of the transaction cost gets recovered in terms of collateral benefits.

Expertise

During the 2008 financial crisis, it was fairly common to blame experts. It was widely acknowledged that it was the “expertise” of economists, financial markets people and regulators that had gotten us into the crisis in the first place. So criticising and mocking them were part of normal discourse.

For example, most of my learning about the 2008 financial crisis came from following blogs written by journalists, such as Felix Salmon, and generalist academics such as Tyler Cowen or Alex Tabarrok or Arnold Kling, rather than blogs written by financial markets experts or practitioners. I don’t think it was very different for too many people.

Cut to 2020 and the covid-19 crisis, and the situation is very different. You have a bunch of people mocking experts (epidemiologists, primarily), but this is in the minority. The generic Twitter discourse seems to be “listen to the experts”.

For example, there was this guy called Tomas Pueyo who wrote a bunch of really nice blog posts (on Medium) about the possible growth of the disease. He got heavily attacked by people in the epidemiology and medicine professions, and (surprisingly to me)  the general twitter discourse backed this up. “We don’t need a silicon valley guy telling us epidemiology”, went the discourse. “Listen to the experts”.

That was perhaps the beginning of the “I’m not an epidemiologist but” meme (not a particularly “fit” meme in terms of propagation, but one that continues to endure). For example, when I wrote my now famous tweetstorm about Bayes’s theorem and random testing 2-3 weeks back, a friend I was discussing it with advised me to “get the thing checked with epidemiologists before publishing”.

This came a bit too late after I’d constructed the tweetstorm, and I didn’t want to abandon it, and so I told him, “but then I’m an expert on Probability and Bayes’s Theorem, and so qualified to put this” and went ahead.

In any case, I have one theory as to why “listen to the expert” has become the dominant discourse in this crisis. It has everything to do with politics.

Two events took place in 2016 that the “twitter establishment” (the average twitter user, weighted by number of followers and frequency of tweeting, if I can say) did not like – the passing of the Brexit referendum and the election of President Trump.

While these two surprising events took place either side of the Atlantic, they were both seen as populist movements that were aimed at the existing establishment. Some commentators saw them as a backlash “against the experts”. The rise of Trump and Brexit (and Boris Johnson) were seen as part of this backlash against expertise.

And the “twitter establishment” (the average twitter user, weighted by number of followers and frequency of tweeting, if I can say) doesn’t seem to like either of these two gentlemen (Trump and Johnson), and they are supposed to be in power because of a backlash against experts. Closer home, in India, the Modi government allegedly doesn’t trust experts, which critics blame for ham-handed decisions like Demonetisation and pushing through of the Citizenship Amendment Act in the face of massive protests (the twitter establishment doesn’t like Modi either).

Essentially we have a bunch of political leaders who are unpopular with the twitter establishment, and who are in place because of their mistrust of expertise, and multiplying negative with negative, you get the strange situation where the twitter establishment is in love with experts now.

And so when mathematicians or computer scientists or economists (or other “Beckerians”) opine on covid-19, they are dismissed as being “not expert enough”. Because any criticism of expertise of any kind is seen as endorsement of the kind of politics that got Trump, Johnson or Modi into power. And the twitter establishment (the average twitter user, weighted by number of followers and frequency of tweeting, if I can say) doesn’t like that.

The corner Bhelpuri guy

There’s this guy who sells Bhelpuri off a cart that he usually stations at the street corner 100 metres from home. His wife (I think) sells platters of cut fruit from another (taller, and covered) cart stationed next to him.

I don’t have any particular fondness for them. I’ve never bought cut fruit platters, for example (I’m told by multiple people that I’m not part of the target segment for this product). I have occasionally bought bhelpuri from this guy, but it isn’t the best you can find in this part of town. Nevertheless, until mid-March he would unfailingly bring his cart to the corner every afternoon and set up shop.

He has since fallen victim to the covid-19 induced lockdown. I have no clue where he is (I don’t know where he lives. Heck, I don’t even know his name). All I know is that he has already suffered a month and a half of revenue loss. I don’t know if he has had enough stash to see him through this zero revenue period.

The lockdown, and the way it has been implemented, has resulted in a number of misalignments of incentives. The prime minister’s regular exhortations to businesses to not lay off employees or cut salaries, for example, have turned the lockdown into a capital versus labour issue. Being paid in full despite not going to work, (organised) labour is only too happy to demand an extension of the lockdown. Capital is running out of money, with zero revenues and having to pay salaries, and wants a reopening.

Our bhelpuri guy, running a one-person business, represents both capital and labour. In fact, he represents the most common way of operating in India – self employment with very limited (and informal) employees. Whether he pays salaries or not doesn’t matter to him (he only has to pay himself). The loss of revenue matters a lot.

The informality of his business means that there is pretty much no way out for him to get any sort of a bailout. He possibly has an Aadhaar card (and other identity cards, such as a voter ID), and maybe even a bank account. Yet, the government (at whatever level) is unlikely to know that he exists as a business. He might have a BPL ration card that might have gotten him some household groceries, but that does nothing to compensate for his loss of business.

If you go by social media, or even comments made by politicians to the media or even to the Prime Minister, the general discourse seems to be to “extend the lockdown until we are completely safe, with the government providing wage subsidies and other support”. All this commentary completely ignores the most popular form of employment in India – informal businesses with a small number of informal employees.

If you think about it, there is no way this set of businesses can really be bailed out. The only way the government can help them is by letting them operate (even that might not help our Bhelpuri guy, since hygiene-conscious customers might think twice before eating off a street cart).

One friend mentioned that the only way these guys can exert political power is through their caste vote banks. However, I’m not sure if these vote banks have a regular enough voice (especially with elections not being nearby).

It may not be that much of a surprise to see some sort of protests or “lockdown disobedience” in case the lockdown gets overextended, especially in places where it’s not really necessary.

PS: I chuckle every time I see commentary (mostly on social media) that we need a lockdown “until we have a vaccine”. It’s like people have internalised the Contagion movie a bit too literally.

Fulfilling needs

We’re already in that part of the crisis where people are making predictions on how the world is going to change after the crisis. In fact, using my personal example, we’ve been in this part of the crisis for a long time now. So here I come with more predictions.

There’s a mailing list I’m part of where we’re talking about how we’ll live our lives once the crisis is over. A large number of responses there are about how they won’t ever visit restaurants or cafes, or watch a movie in a theatre, or take public transport, or travel for business, for a very very long time.

While it’s easy to say this, the thing with each of these supposedly dispensable activities is that they each serve a particular purpose, or set of purposes. And unless people are able to fulfil these needs that these activities serve with near-equal substitutes, I don’t know if these activities will decline by as much as people are talking about.

Let’s start with restaurants and cafes. One purpose they serve is to serve food, and one easy substitute for that is to take the food away and consume it at home. However, that’s not their only purpose. For example, they also provide a location to consume the food. If you think of restaurants that mostly survive because working people have their midday lunch there, the place they offer for consuming the food is as important as the food itself.

Then, restaurants and cafes also serve as venues to meet people. In fact, more than half my eating (and drinking) out over the last few years has been on account of meeting someone. If you don’t want to go to a restaurant or cafe to meet someone (because you might catch the virus), what’s the alternative?

There’s a certain set of people we might be inclined to meet at home (or office), but there’s a large section of people you’re simply not comfortable enough with to meet at a personal location, and a “third place” surely helps (also now we’ll have a higher bar on people we’ll invite home or to offices). If restaurants and cafes are going to be taboo, what kind of safe “third places” can emerge?

Then there is the issue of the office. For six to eight months before the pandemic hit, I kept thinking about getting myself an office, perhaps a co-working space, so that I could separate out my work and personal lives. NED meant I didn’t execute on that plan. However, the need for an office remains.

Now there’s greater doubt on the kind of office space I’ll get. Coworking spaces (at least shared desks) are out of the question. This also means that coffee shops doubling up as “computer classes” aren’t feasible any more. I hate open offices as well. Maybe I have to either stick to home or go for a private office someplace.

As for business travel – it has been a great costly signal. For example, there had been some clients who I’d been utterly unable to catch over the phone. One trip to their city, and they enthusiastically gave appointments, and one hour meetings did far more than multiple messages or emails or phone calls could have done. Essentially by indicating that I was willing to take a plane to meet them, I signalled that I was serious about getting things done, and that got things moving.

In the future, business travel will “become more costly”. While that will still serve the purpose of “extremely costly signalling”, we will need a new substitute for “moderately costly signalling”.

And so forth. What we will see in the course of the next few months is that we will discover that a lot of our activities had purposes that we hadn’t thought of. And as we discover these purposes one by one, we are likely to change our behaviours in ways that will surprise us. It is too early to say which sectors or industries will benefit from this.

Post-Covid Stimulus

There are two ways in which businesses have been adversely affected by the ongoing Covid-19 crisis. Using phrases from my algorithmic trading days, let me call this “temporary impact” and “permanent impact”.

For some businesses, the Covid-19 crisis and the associated lockdown means about three months or so of zero (or near-zero) revenues. There is nothing inherently unsafe about these businesses that makes their sales take a “permanent hit” after the crisis has passed us by. Once the economy opens up again, these businesses can do business like they used to before, except that they are staring at a three-odd month revenue hole at the top of their P&L.

The second kind of businesses are going to be “permanently impacted”. They involve activities that are going to be labelled as “unsafe” even after the crisis is over, and people are going to do less of these.

For example, bars and restaurants are going to see a “permanent impact” because of the crisis – people are not going to relish sitting in a public place with strangers in the next one year, and a large proportion of restaurants will have to go out of business.

Similarly any industry associated with travel – such as transport (airlines, railways, buses), hotels and taxis will see a permanent impact from the crisis. Real estate is also likely to be hit hard by the crisis. For all these sectors (and more), even after the economy is otherwise back in full swing, it will be a very long time before they see the sort of demand seen before the crisis.

Now that the distinction is clear (there will always be sectors that sort of lie on the borderline, but at least we have a classification), we can use it to determine how governments should respond to stimulate economies after the crisis.

Based on all the commentary going around, it seems like a given that governments and central banks need to do their bit to stimulate the economies. The collapse in both demand and supply thanks to the crisis means that governments will collect less taxes this year than expected. So while to some extent they will be able to possibly borrow more, or monetise deficit, or set aside money from other budgeted items, the funds available for stimulating businesses are likely to be limited.

So what sectors of the economy should the governments (and central banks) choose to spend this precious stimulus on? My take is that they should not bother about businesses that will be permanently impacted by the crisis – at best, the money will go into delaying the inevitable at some of these companies, and if structured in the form of a loan, will be highly unlikely to be repaid.

Instead, the government should spend to stimulate sections of the economy where the impact of the crisis is temporary – in order to make the crisis “more temporary”. By giving cash to sectors that are going to be fundamentally solvent, this cash can be more assured to “travel around the economy”, thus giving more of the proverbial bang for the buck.

This essentially means that sectors most affected by the current crisis should not get any help from the governments – this might sound counterintuitive, but if the true intention of the government stimulus is to stimulate the economy rather than helping a particular set of companies, this makes eminent sense.

Oh, and in the Indian context, this seems like the perfect time to “let go” of Air India.

Tests per positive case

I seem to be becoming a sort of “testing expert”, though the so-called “testing mafia” (ok I only called them that) may disagree. Nothing external happened since the last time I wrote about this topic, but here is more “expertise” from my end.

As some of you might be aware, I’ve now created a script that does the daily updates that I’ve been doing on Twitter for the last few weeks. After I went off twitter last week, I tried for a couple of days to get friends to tweet my graphs. That wasn’t efficient. And I’m not yet over the twitter addiction enough to log in to twitter every day to post my daily updates.

So I’ve done what anyone who has a degree in computer science, and who has a reasonable degree of self-respect, should do – I now have this script (that runs on my server) that generates the graph and some mildly “intelligent” commentary and puts it out at 8am everyday. Today’s update looked like this:

Sometimes I make the mistake of going to twitter and looking at the replies to these automated tweets (that can be done without logging in). Most replies seem to be from the testing mafia. “All this is fine but we’re not testing enough so can’t trust the data”, they say. And then someone goes off on “tests per million” as if that is some gold standard.

As I discussed in my last post on this topic, random testing is NOT a good thing here. There are several ethical issues with that. The error rates with the testing means that there is a high chance of false positives, and also false negatives. So random testing can both “unleash” infected people, and unnecessarily clog hospital capacity with uninfected.
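To make the false-positive point concrete, here is a minimal sketch using Bayes's theorem. All the numbers are assumptions chosen for illustration (a 0.1% base rate for a random person, a 30% base rate for a high-risk contact, 70% sensitivity, 99% specificity), not measured values for any actual test.

```python
def prob_infected_given_positive(base_rate, sensitivity, specificity):
    """Bayes's theorem: P(infected | positive test)."""
    true_positives = sensitivity * base_rate
    false_positives = (1.0 - specificity) * (1.0 - base_rate)
    return true_positives / (true_positives + false_positives)

# Assumed test characteristics, for illustration only.
SENSITIVITY = 0.70
SPECIFICITY = 0.99

# Random testing: the person tested has a very low prior chance of infection.
random_testing = prob_infected_given_positive(0.001, SENSITIVITY, SPECIFICITY)

# Targeted testing: the person tested has a high prior (say, a close contact).
targeted_testing = prob_infected_given_positive(0.30, SENSITIVITY, SPECIFICITY)
```

Under these assumed numbers, a positive result from random testing means only about a 6 to 7% chance of actual infection, while the same result on a high base rate person means about 97%.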

So if random testing is not a good metric on how adequately we are testing, what is? One idea comes from this Yahoo report on covid management in Vietnam.

According to data published by Vietnam’s health ministry on Wednesday, Vietnam has carried out 180,067 tests and detected just 268 cases, 83% of whom it says have recovered. There have been no reported deaths.

The figures are equivalent to nearly 672 tests for every one detected case, according to the Our World in Data website. The next highest, Taiwan, has conducted 132.1 tests for every case, the data showed

Total tests per positive case. Now, that’s an interesting metric. The basic idea is that if most of the people we are testing show positive, then we simply aren’t testing enough. However, if we are testing a lot of people for every positive case, then it means that we are also testing a large number of marginal cases (there is one caveat I’ll come to).

Tests per positive case also takes the “base rate” into account. If a region has been affected massively, then the base rate itself will be high, and the region needs to test more. A less affected region needs less testing (remember we only test those with a high base rate). And it is likely that in a region with a higher base rate, more positive cases are found (this is a deadly disease, so anyone with more than a mild occurrence of the disease is bound to get themselves tested).

The only caveat here is that the tests need to be “of high quality”, i.e. they should be done on people with high base rates of having the disease. Any measure that becomes a metric is bound to be gamed, so if tests per positive case becomes a metric, it is easy for a region to game that by testing random people (rather than those with high base rates). For now, let’s assume that nobody has made this a “measure” yet, so there isn’t that much gaming yet.

So how is India faring? Based on data from covid19india.org, as of yesterday (23rd April) India had done about 520,000 tests, of which about 23,000 have come back positive. In other words, India has tested about 23 people for every positive case. Compared to Vietnam (or even Taiwan) that’s a really low number.

However, different states are testing to different extents by this metric. Again using data from covid19india.org, I created this chart that shows the cumulative “tests per positive case” in each state in India. I drew each state in a separate graph, with different scales, because they were simply not comparable.

Notice that Maharashtra, our worst affected state, is only testing 14 people for every positive case, and this number is going down over time. Testing capacity in that state (which has, on an absolute number, done the maximum number of tests) is sorely stretched, and it is imperative that testing be scaled up massively there. It seems highly likely that testing has been backlogged there with not enough capacity to test the high base rate cases. Gujarat and Delhi, other badly affected states, are also in similar boats, testing only 16 and 13 people (respectively) for every infected person.

At the other end, Orissa is doing well, testing 230 people for every positive case (this number is rising). Karnataka is not bad either, with about 70 tests per case  (again increasing. The state massively stepped up on testing last Thursday). Andhra Pradesh is doing nearly 60. Haryana is doing 65.

Now I’m waiting for the usual suspects to reply to this (either on twitter, or as a comment on my blog) saying this doesn’t matter because we are “not doing enough tests per million”.

I wonder why some people are proud to show off their innumeracy (OK fine, I understand that it’s a bit harsh to describe someone who doesn’t understand Bayes’s Theorem as “innumerate”).

 

Zoom in, zoom out

It was early on in the lockdown that the daughter participated in her first ever Zoom videoconference. It was an extended family call, with some 25 people across 9 or 10 households.

It was chaotic, to say the least. Family call meant there was no “moderation” of the sort you see in work calls (“mute yourself unless you’re speaking”, etc.). Each location had an entire family, so apart from talking on the call (which was chaotic with so many people anyways), people started talking among themselves. And that made it all the more chaotic.

Soon the daughter was shouting that it was getting too loud, and turned my computer volume down to the minimum (she’s figured out most of my computer controls in the last 2 months). After that, she lost interest and ran away.

A couple of weeks later, the wife was on a zoom call with a big group of her friends, and asked the daughter if she wanted to join. “I hate zoom, it’s too loud”, the daughter exclaimed and ran away.

Since then she has taken part in a couple of zoom calls, organised by her school. She sat with me once when I chatted with a (not very large) group of school friends. But I don’t think she particularly enjoys Zoom, or large video calls. And you need to remember that she is a “video call native”.

The early days of the lockdown were ripe times for people to turn into gurus, and make predictions with the hope that nobody would ever remember them in case they didn’t come through (I indulged in some of this as well). One that made the rounds was that group video calling would become much more popular and even replace group meetings (especially in the immediate aftermath of the pandemic).

I’m not so sure. While the rise of video calling has indeed given me an excuse to catch up “visually” with friends I haven’t seen in ages, I don’t see that much value from group video calls, after having participated in a few. The main problem is that there can, at a time, be only one channel of communication.

A few years back I’d written about the “anti two pizza rule” for organising parties, where I said that if you have a party, you should either have five or fewer guests, or ten or more (or something of the sort). The idea was that five or fewer can indeed have one coherent conversation without anyone being left out. Ten or more means the group naturally splits into multiple smaller groups, with each smaller group able to have conversations that add value to them.

In between (6-9 people) means it gets awkward – the group is too small to split, and too large to have one coherent conversation, and that makes for a bad party.

Now take that online. Because we have only one audio channel, there can only be one conversation for the entire group. This means that for a group of 10 or above, any “cross talk” needs to be necessarily broadcast, and that interferes with the main conversation of the group. So however large the group size of the online conversation, you can’t split the group. And the anti two pizza rule becomes “anti greater than or equal to two pizza rule”.

In other words, for an effective online conversation, you need to have four (or at max five) participants. Else you risk the group getting unwieldy, some participants feeling left out or bored, or so much cross talk that nobody gets anything out of it.

So Zoom (or any other video chat app) is not going to replace any of our regular in-person communication media. It might to a small extent in the immediate wake of the pandemic, when people are afraid to meet large groups, but it will die out after that. OK, that is one more prediction from my side.

In related news, I swore off lecturing in webinars some five years ago. I found it really stressful to lecture without being able to look into the eyes of the “students”. I wonder if teachers worldwide who are being forced to lecture online because schools are shut feel the way I do.

More on covid testing

There has been a massive jump in the number of covid-19 positive cases in Karnataka over the last couple of days. Today, there were 44 new cases discovered, and yesterday there were 36. This is a big jump from the average of about 15 cases per day in the preceding 4-5 days.

The good news is that not all of this is new infection. A lot of cases that have come out today are clusters of people who have collectively tested positive. However, there is one bit from yesterday’s cases (again a bunch of clusters) that stands out.

Source: covid19india.org

I guess by now everyone knows what “travelled from Delhi” is a euphemism for. The reason they are interesting to me is that they are based on a “repeat test”. In other words, all these people had tested negative the first time they were tested, and then they were tested again yesterday and found positive.

Why did they need a repeat test? That’s because the sensitivity of the Covid-19 test is rather low. Out of every 100 infected people who take the test, only about 70 are found positive (on average). That number also depends upon when the sample is taken. From the abstract of this paper:

Over the four days of infection prior to the typical time of symptom onset (day 5) the probability of a false negative test in an infected individual falls from 100% on day one (95% CI 69-100%) to 61% on day four (95% CI 18-98%), though there is considerable uncertainty in these numbers. On the day of symptom onset, the median false negative rate was 39% (95% CI 16-77%). This decreased to 26% (95% CI 18-34%) on day 8 (3 days after symptom onset), then began to rise again, from 27% (95% CI 20-34%) on day 9 to 61% (95% CI 54-67%) on day 21.

About one in three infected people (depending upon when you draw the sample) are found by the test to be uninfected. Maybe I should state it again. If you test a covid-19 positive person for covid-19, there is almost a one-third chance that she will be found negative.

The good news (on the face of it) is that the test has “high specificity” of about 97-98%, or a false positive rate of 2-3% (this is from conversations I’ve had with people in the know; I’m unable to find links to corroborate this). That seems rather accurate, except that when the “prior probability” of having the disease is low, even this specificity is not good enough.

Let’s assume that a million Indians are covid-19 positive (the official numbers as of today are a little more than one-hundredth of that number). With one and a third billion people, that represents 0.075% of the population.

Let’s say we were to start “random testing” (as a number of commentators are advocating), and were to pull a random person off the street to test for Covid-19. The “prior” (before testing) likelihood she has Covid-19 is 0.075% (assume we don’t know anything more about her to change this assumption).

If we were to take 20000 such people, 15 of them will have the disease. The other 19985 don’t. Let’s test all 20000 of them.

Of the 15 who have the disease, the test returns “positive” for 10.5 (70% sensitivity; round up to 11). Of the 19985 who don’t have the disease, the test returns “positive” for 400 (placing more faith in the test, let’s assume a specificity of 98%, or a false positive rate of 2%)! In other words, if there were a million Covid-19 positive people in India, and a random Indian were to take the test and test positive, the likelihood she actually has the disease is 11/411, or about 2.7% (2.6% if we don’t round the 10.5 up).

If there were 10 million covid-19 positive people in India (no harm in supposing), then the “base rate” would be 0.75%. So out of our sample of 20000, 150 would have the disease. Again testing all 20000, 105 of the 150 who have the disease would test positive, and 397 of the 19850 who don’t have the disease would also test positive. In other words, if there were ten million Covid-19 positive people in India, and a random Indian were to take the test and test positive, the likelihood she actually has the disease is 105/(397+105) = 21%.

If there were ten million Covid-19 positive people in India, only one-fifth of the people who tested positive in a random test would actually have the disease.
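The arithmetic above is just Bayes’s Theorem, and can be sketched in a few lines of Python. This is a minimal sketch under the same assumptions as the text (70% sensitivity, 98% specificity, a population of one and a third billion); the function and variable names are made up for illustration. It works with unrounded counts, so the first scenario comes out at about 2.6%.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(infected | test positive), by Bayes's Theorem."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Assumptions taken from the text, not measured values.
POPULATION = 4e9 / 3   # "one and a third billion"
SENSITIVITY = 0.70     # 70 of every 100 infected people test positive
SPECIFICITY = 0.98     # false positive rate of 2%

for infected in (1_000_000, 10_000_000):
    prior = infected / POPULATION
    posterior = positive_predictive_value(prior, SENSITIVITY, SPECIFICITY)
    print(f"{infected:>10,} infected: prior {prior:.3%}, posterior {posterior:.1%}")
```

Changing the prior passed to `positive_predictive_value` shows how quickly the posterior improves once the person being tested is not a random pick off the street.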

Take a sip of water (ok I’m reading The Ken’s Beyond The First Order too much nowadays, it seems).

This is all standard maths stuff, and any self-respecting book or course on probability and Bayes’s Theorem will have at least a reference to AIDS or cancer testing. The story goes that this was a big deal in the 1990s when some people suggested that the AIDS test be used widely. Then, once this problem of false positives and posterior probabilities was pointed out, the strategy of only testing “high risk cases” got accepted.

And with a “low incidence” disease like covid-19, effective testing means you test people with a high prior probability. In India, that has meant testing people who travelled abroad, people who have come in contact with others known to be infected, healthcare workers, people who attended the Tablighi Jamaat conference in Delhi, and so on.

The advantage with testing people who already have a reasonable chance of having the disease is that once the test returns positive, you can be pretty sure they actually have the disease. It is more effective and efficient. Testing people with a “high prior probability of disease” is not discriminatory, or a “sampling bias” as some commentators alleged. It is prudent statistical practice.

Again, as I found to my own detriment with my tweetstorm on this topic the other day, people are bound to see politics and ascribe political motives to everything nowadays. In that sense, a lot of the commentary is not surprising. It’s also not surprising that when “one wing” heavily retweeted my article, “the other wing” made efforts to find holes in my argument (which, again, is textbook math).

One possibly apolitical criticism of my tweetstorm was that “the purpose of random testing is not to find out who is positive. It is to find out what proportion of the population has the disease”. The costs of random testing for this purpose (apart from the monetary cost of the tests themselves) are threefold. Firstly, a large number of uninfected people will test positive and get hospitalised in covid-specific hospitals, clogging hospital capacity and increasing the chances that they get infected while in hospital.

Secondly, getting a truly random sample in this case is tricky, and possibly unethical. When you have limited testing capacity, you would be inclined (possibly morally, even) to use it on people who already have a high prior probability.

Finally, when the incidence is small, we need a really large sample to pin the true incidence down to a useful range.

Let’s say 1 in 1000 Indians have the disease (or about 1.35 million people). Using the Chi Square test of proportions, the precision of our estimate of the incidence depends heavily on how many people are tested.

If we test 1000 people and find 1 positive, the true incidence of the disease (95% confidence interval) could be anywhere from 0.01% to 0.65%.

If we test 10000 people and find 10 positive, the true incidence of the disease could be anywhere between 0.05% and 0.2%.

Only if we test 100000 people (a truly massive random sample) and find 100 positive does the true incidence lie between 0.08% and 0.12%, an acceptable range.
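These three ranges can be reproduced approximately in Python. This is a minimal sketch, and it rests on an assumption: that the 95% intervals are Wilson score intervals with continuity correction, which is the kind of interval that usually accompanies a chi-square test of proportions. The function name is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def wilson_cc_interval(positives, tested, confidence=0.95):
    """Wilson score interval with continuity correction for a proportion.

    Assumed here to be the interval behind the ranges quoted in the text.
    """
    n = tested
    p = positives / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%
    z2 = z * z
    denom = 2.0 * (n + z2)

    if positives == 0:
        lower = 0.0
    else:
        root = sqrt(z2 - 2 - 1 / n + 4 * p * (n * (1 - p) + 1))
        lower = (2 * n * p + z2 - 1 - z * root) / denom

    if positives == n:
        upper = 1.0
    else:
        root = sqrt(z2 + 2 - 1 / n + 4 * p * (n * (1 - p) - 1))
        upper = (2 * n * p + z2 + 1 + z * root) / denom

    return max(0.0, lower), min(1.0, upper)


# Same observed incidence (1 in 1000), three different sample sizes.
for positives, tested in ((1, 1000), (10, 10_000), (100, 100_000)):
    low, high = wilson_cc_interval(positives, tested)
    print(f"{positives:>3} of {tested:>7,}: {low:.3%} to {high:.3%}")
```

The observed incidence is the same in all three cases; only the sample size changes, and with it the width of the interval.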

I admit that we may not be testing enough. A simple rule of thumb is that anyone with more than a 5% prior probability of having the disease needs to be tested. How we determine this prior probability is again dependent on some rules of thumb.

I’ll close by saying that we should NOT be doing random testing. That would be unethical on multiple counts.