The Science in Data Science

The science in “data science” basically represents the “scientific method”.

It’s a decade since the phrase “data scientist” got coined, though if you go on LinkedIn, you will find people who claim to have more than two decades of experience in the subject.

The origins of the phrase itself are unclear, though some sources claim that it came out of this HBR article in 2012, written by Thomas Davenport and DJ Patil (though, in 2009, Hal Varian, formerly Google’s Chief Economist, had said that the “sexiest job of the 21st century” would be that of a statistician).

Some of you might recall that in 2018, I had said that “I’m not a data scientist any more”. That was mostly down to my experience working with companies in London, where I found that data science was used as a euphemism for “machine learning” – something I was incredibly uncomfortable with.

With the benefit of hindsight, it seems like I was wrong. My view on data science being a euphemism for machine learning came from interacting with small samples of people (though it could be an English quirk). As I’ve dug around over the years, it seems like the “science” in data science comes not from the maths in machine learning, but elsewhere.

One phenomenon that had always intrigued me was the number of people with PhDs, especially NOT in maths, computer science or statistics, who have made a career in data science. Initially I put it down to “the gap between PhD and tenure-track faculty positions in science”. However, the numbers kept growing.

The more perceptive of you might know that I run a podcast now. It is called “Data Chatter”, and is ten episodes old now. The basic aim of the podcast is for me to have some interesting conversations – and then release them for public benefit. Yeah, yeah.

So, there was this thing that intrigued me, and I have a podcast. I did what you would have expected me to do – get a guest on who went from a science background to data science. I got Dhanya, my classmate from school, to talk about how her background with a PhD in neuroscience has helped her become a better data scientist.

It is a fascinating conversation, and served its primary purpose of making me understand what the “science” in data science really is. I had gone into the conversation expecting to talk about some machine learning, and how that gets used in academia or whatever. Instead, we spoke for an hour about designing experiments, collecting data and testing hypotheses.

The science in “data science” basically represents the “scientific method”. What Dhanya told me (you should listen to the conversation) is that a PhD prepares you for thinking in the scientific method, and drills into you years of practice in it. And this is especially true of “experimental” PhDs.

And then, last night, while preparing the notes for the podcast release, I stumbled upon the original HBR article by Thomas Davenport and DJ Patil talking about “data science”. And I found that they talk about the scientific method as well. And I found that I had talked about it in my newsletter as well – only to forget it later. This is what I had written:

Reading Patil and Davenport’s article carefully suggests, however, that companies might be making a deliberate attempt at recruiting pure science PhDs for data scientist roles.

The following excerpts from the article (which possibly shaped the way many organisations think about data science) can help us understand why PhDs are sought after as data scientists.

  • Data scientists’ most basic, universal skill is the ability to write code. This may be less true in five years’ time (Ed: the article was published in late 2012, so we’re almost “five years later” now)
  • Perhaps it’s becoming clear why the word “scientist” fits this emerging role. Experimental physicists, for example, also have to design equipment, gather data, conduct multiple experiments, and communicate their results.
  • Some of the best and brightest data scientists are PhDs in esoteric fields like ecology and systems biology.
  • It’s important to keep that image of the scientist in mind—because the word “data” might easily send a search for talent down the wrong path

Patil and Davenport make it very clear that traditional “data analysts” may not make for great data scientists.

We learn, and we forget, and we re-learn. But learning is precisely what the scientific method, which underpins the “science” in data science, is all about. And it is definitely NOT about machine learning.

Beckerian Disciplines

When Gary Becker was awarded the “Nobel Prize” (or whatever its official name is) for Economics, the award didn’t cite any single work of his. Instead, as Justin Wolfers wrote in his obituary,

He was motivated by the belief that economics, taken seriously, could improve the human condition. He founded so many new fields of inquiry that the Nobel committee was forced to veer from the policy of awarding the prize for a specific piece of work, lauding him instead for “having extended the domain of microeconomic analysis to a wide range of human behavior and interaction, including nonmarket behavior.”

Or as Matthew Yglesias put it in his obituary of Becker,

Becker is known not so much for one empirical finding or theoretical conjecture, as for a broad meta-insight that he applied in several areas and that is now so broadly used that many people probably don’t realize that it was invented relatively recently.

Becker’s idea, in essence, was that the basic toolkit of economic modeling could be applied to a wide range of issues beyond the narrow realm of explicitly “economic” behavior. Though many of Becker’s specific claims remain controversial or superseded by subsequent literature, the idea of exploring everyday life through a broadly economic lens has been enormously influential in the economics profession and has altered how other social sciences approach their issues

Essentially Becker sort of pioneered the idea of using economic reasoning for fields outside traditional economics. It wasn’t always popular – for example, his use of economics methods in sociology was controversial, and “traditional sociologists” didn’t like the encroachment into their field.

However, Becker’s ideas endured. It is common nowadays for economists to explore ideas traditionally considered outside the boundaries of “standard economics”.

I think this goes well beyond economics. I think there are several other fields that are prone to “going out of syllabus” – where concepts are generic enough that they can be applied to areas traditionally outside those fields.

One obvious candidate is mathematics – most mathematical problems come from “real life”, and only the purest of mathematicians don’t include an application from “real life” (well outside of mathematics) while writing a mathematical paper. Immediately coming to mind is the famous “Hall’s Marriage Theorem” from Graph Theory.

Speaking of Graph Theory, Computer Science is another candidate (especially the area of algorithms, which I sort of specialised in during my undergrad). I remember being thoroughly annoyed that papers and theses that would start so interestingly with a real-life problem would soon devolve into inscrutable maths by the time you got to the second section. I remember my B.Tech. project (this was taken rather seriously at IIT Madras) being about what I had described as a “Party Hall Problem” (this was in Online Algorithms).

Rather surprisingly (to me), another area whose practitioners are fond of encroaching into other subjects is physics. This old XKCD sums it up:

Complex Systems (do you know most complex systems scientists are physicists by training?) is another such field. There are more.

In any case, assuming no one else has done this already, I hereby christen all these fields (whose practitioners are prone to venturing into “out of syllabus matters”) as “Beckerian Disciplines” in honour of Gary Becker (OK, I have an economics bias, but I’m pretty sure there have been scientists well before Becker who have done this).

And then you have what I now call “anti-Beckerian Disciplines” – areas that get pissed off that people from other fields are “invading their territory”. In Becker’s own case, the anti-Beckerian Discipline was Sociology.

When university departments talk about “interdisciplinary research”, what they really need is Beckerians – people who are able and willing to step out of the comfort zones of their own disciplines to lend a fresh pair of eyes (and a fresh perspective) to other disciplines.

The problem with this is that they can encounter an anti-Beckerian response from people trying to defend their own turf from “outside invasion”. This doesn’t help the cause of science (or research of any kind), but in general (well, a LOT of exceptions exist), academics can be a prickly and insecure bunch, forever playing zero-sum status games.

With the covid-19 crisis, one set of anti-Beckerians who have emerged is epidemiologists. Epidemiology is a nice discipline in that it can be studied using graph theory, non-linear dynamics, (as I did earlier today) simple Bayesian maths, or any number of other frameworks that don’t need a degree in biology or medicine.
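
To give a flavour of what “simple Bayesian maths” means here, here is a minimal sketch (with entirely made-up numbers, not actual covid-19 figures) of one such calculation: what a positive test result actually tells you, given an assumed prevalence.

```python
# Hypothetical illustration: P(infected | positive test) via Bayes' rule.
def posterior_infected(prevalence, sensitivity, specificity):
    p_pos_given_infected = sensitivity
    p_pos_given_healthy = 1 - specificity
    p_positive = (prevalence * p_pos_given_infected
                  + (1 - prevalence) * p_pos_given_healthy)
    return prevalence * p_pos_given_infected / p_positive

# With an assumed 1% prevalence, 90% sensitivity and 95% specificity,
# a positive test implies only about a 15% chance of actually being infected.
print(posterior_infected(prevalence=0.01, sensitivity=0.90, specificity=0.95))
```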

And epidemiologists are not happy (I’m not talking about my tweet specifically, but this is a more general comment) that their turf is being invaded. “Listen to the experts”, they are saying, with the assumption that the experts in question here are them. People are resorting to credentialism. They’re adding “, PhD” to their names on twitter (a particularly shady credentialist practice IMHO). They’re questioning the credentials and locus standi of people producing interesting analysis.

Enough of this rant. Since you’ve come all the way, I leave you with this particularly awesome blogpost about epidemiologists by Tyler Cowen, who is a thoroughly Beckerian economist. Sample this:

Now, to close, I have a few rude questions that nobody else seems willing to ask, and I genuinely do not know the answers to these:

a. As a class of scientists, how much are epidemiologists paid?  Is good or bad news better for their salaries?

b. How smart are they?  What are their average GRE scores?

c. Are they hired into thick, liquid academic and institutional markets?  And how meritocratic are those markets?

d. What is their overall track record on predictions, whether before or during this crisis?

e. On average, what is the political orientation of epidemiologists?  And compared to other academics?  Which social welfare function do they use when they make non-trivial recommendations?

Pregnancy, childbirth, correlation, causation and small samples

When you’re pregnant, or have just given birth, people think it’s pertinent to give you unsolicited advice. Most of this advice is couched in the garb of “traditional wisdom” and, as you might expect, the older the advisor, the higher the likelihood of them proffering such advice.

The interesting thing about this advice is the use of fear. “If you don’t do this you’ll forever remain fat”, some will say. Others will forbid you from eating something else because it can “chill the body”.

If you politely listen to such advice, the advice will stop. But if you make a counter-argument, these “elders” (for lack of a better word) make what I call the long-term argument. “Now you might think this is all fine, but don’t tell me I didn’t advise you when you get osteoporosis at the age of 50”, they say.

While most of this advice is well intentioned, the problem with most such advice is that it is based on evidence from fairly small samples, and is prone to the error of mistaking correlation for causation.

While it is true that it was fairly common to have dozens of children even two generations ago in India, the problem is that most of the advisors would have seen only a small number of babies, based on which they form their theories – and even a dozen is not large enough to confirm a theory to any decent level of statistical significance.
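
To put a rough number on this, here is a minimal sketch (with entirely made-up, illustrative numbers): suppose an advisor has seen 12 babies raised on some traditional practice, of whom 10 turned out fine. If 80% of babies turn out fine regardless of the practice, that observation is entirely unremarkable.

```python
# Illustrative sketch: how little a dozen observations can establish.
from math import comb

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Chance of seeing 10 or more "fine" babies out of 12 even if the practice
# does nothing, assuming 80% of babies turn out fine anyway:
print(binom_tail(12, 10, 0.8))   # ~0.56 - nowhere near statistical significance
```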

The other problem is that we haven’t had the culture of scientific temperament and reasoning for long enough in India for people to trust scientific methods and results – people a generation or two older are highly likely to dismiss results that don’t confirm their priors.

Add to this confirmation bias – where cases of people violating “traditional wisdom” and then having some kind of problem are more likely to be noticed than cases of people having issues despite following “traditional wisdom” – and you can imagine the level of non-science that can creep into so-called conventional wisdom.

We’re at a hospital that explicitly tries to reverse these pre-existing biases (I’m told that at a lactation class yesterday they firmly reinforced why traditional ways of holding babies while breastfeeding are incorrect), and that, in the face of “elders’” advice, can lead to potential conflict.

On the one hand we have scientific evidence given by people who you aren’t likely to encounter too many more times in life. On the other you have unscientific “traditional” wisdom that comes with all kinds of logical inconsistencies given by people you encounter on a daily basis. 

Given this (im)balance, is it any surprise at all that scientific evidence gets abandoned in favour of adoption and propagation of all the logical inconsistencies?

PS: recently I was cleaning out some old shelves and found a copy of this book called “Science, Non-Science and the Paranormal”. The book belonged to my father, and it makes me realise now that he was a so-called “rationalist”.

At every opportunity he would encourage me to question things, and not take them at face value. And every so often he’d say, “you are a science student. So how can you accept this without questioning?”. This would annoy some of my other relatives to no end (since they would end up having to answer lots of questions from me), but it might also explain why I’m less trusting of “traditional wisdom” than others of my generation.

Hooke’s Curve, hooking up and dressing sense

So Priyanka and I were talking about a mutual acquaintance, and the odds of her (the acquaintance) being in a relationship, or trying to get into one. I offered “evidence” that this acquaintance (who I meet much more often than Priyanka does) has been dressing progressively better over the last year, and from that evidence, it’s likely that she’s getting into a relationship.

“It can be the other way, too”, Priyanka countered. “Haven’t you seen countless examples of people who have started dressing really badly once they’re in a relationship?”. Given that I had several data points in this direction, too, there was no way I could refute it. Yet, I continued to argue that given what I know of this acquaintance, it’s more likely that she’s still getting into a relationship now.

“I can explain this using Hooke’s Law”, said Priyanka. Robert Hooke, as you know, was a polymath British scientist of the seventeenth century. He made seminal contributions to various branches of science, though to the best of my knowledge he didn’t say anything about relationships (he was himself a lifelong bachelor). In Neal Stephenson’s The Baroque Cycle, for example, Hooke conducts a kidney stone removal operation on one of the protagonists, and given the range of his expertise, that’s not too far-fetched.

“So do you mean Hooke’s Law as in stress is proportional to strain?”, I asked. Priyanka asked if I remembered the Hooke’s Curve. I said I didn’t. “What happens when you keep increasing stress?”, she asked. “Strain grows proportionally until it snaps”, I said. “And how does the curve go then?”, she asked. I made a royal mess of drawing this curve (it didn’t help that in my mind I had plotted stress on the X-axis and strain on the Y, while the convention is the other way round).

After making a few snide remarks about my IIT-JEE performance, Priyanka asked me to look up the curve and proceeded to explain how the Hooke’s curve (produced here) explains relationships and dressing sense.

“As you get into a relationship, you want to impress the counterparty, and so you start dressing better”, she went on. “These two feed on each other and grow together, until the point when you start getting comfortable in the relationship. Once that happens, the need to impress the other person decreases, and you start wearing more comfortable, and less fashionable, clothes. And then you find your new equilibrium.

“Different people find their equilibria at different points, but for most it’s close to their peak. Some people, though, regress all the way to where they started.

“So yes, when people are getting into a relationship they start dressing better, but you need to watch out for when their dressing sense starts regressing. That’s the point when you know they’ve hooked up”, she said.

By this point in time I was asking to touch her feet (which was not possible since she’s currently at the other end of the world). Connecting two absolutely unrelated concepts – Hooke’s Law and hooking up – and building a theory on that. This was further (strong) confirmation that I’d married right!

Discrete and continuous diseases

Some three years or so back I got diagnosed with ADHD, and put on a course of Methylphenidate. The drug worked, made me feel significantly better and more productive, and I was happy that a problem that should have been diagnosed at least a decade earlier had finally been diagnosed.

Yet, there were people telling me that there was nothing particularly wrong with me, and how everyone goes through what are the common symptoms of ADHD. It is a fact that if you go through the ADHD questionnaire (not linking to it here), there is a high probability of error of commission. If you believe you have it, you can will yourself into answering such that the test indicates that you have it.

Combine this with the claim that there is heavy error of commission in terms of diagnosis and drugging (claims are that some 10% of American kids are on Methylphenidate), and it can spook you, and make you question whether your diagnosis is correct. It doesn’t help matters that there is no objective diagnostic test to detect ADHD.

And then you read articles such as this one, which talks about ADHD in kids in Mumbai. And this spooks you from the other direction. Looking at some of the cases mentioned here, you realise yours is nowhere as bad, and you start wondering if you suffer from the same condition as some of the people mentioned in the piece.

The thing with a condition such as ADHD is that it is a “continuous” disease, in that it occurs in different people to varying degrees. So if you ask a question like “does this person have ADHD” it is very hard to give a straightforward binary answer, because by some definitions, “everyone has ADHD” and by some others, where you compare people to the likes of the girl mentioned in the Mid-day piece (linked above), practically no one has ADHD.

Treatment also differs accordingly. Back when I was taking the medication, I used to take about 10mg of Methylphenidate per day. A friend, who is also on Methylphenidate and of a comparable dosage, informs me that there are people who are on the same drug at a dosage that is several orders of magnitude higher. In that sense, the medical profession has figured out the continuous nature of the problem and learnt to treat it accordingly (a “bug”, however, is that it is hard to determine optimal dosage first up, and it is done through a trial and error process).

The problem is that we are used to binary classification of conditions – you either have a cold or you don’t. You have a fever or you don’t (though arguably once you have a fever, you can have a fever to different degrees). You have typhoid or you don’t. And so forth.

So coming from this binary prior of classifying diseases, continuous diseases such as ADHD are hard to fathom for some people. And that leads to claims of both over and under medication, and it makes clinical research also pretty hard.

Do I have ADHD? Again it’s hard to give a binary answer to that. It depends on where you want to draw the line.

Narendra Modi and the Correlation Term

In a speech in Canada last night, Prime Minister Narendra Modi said that the relationship between India and Canada is like the “2ab term” in the formula for expansion of (a+b)^2.

Unfortunately for him, this has been widely lampooned on twitter, with some people seemingly not getting the mathematical reference, and others making up some unintended consequences of it.

In my opinion, however, it is a masterstroke, and brings to notice something that people commonly ignore – what I call the “correlation term”. When any kind of break-up or disagreement happens – like someone quitting a job, or a couple breaking up, or a band disbanding – people are bound to ask the question of whose fault it was. The general assumption is that if two entities did not agree, it was because both of them sucked.

However, considering the frequency at which such events (break-ups or disagreements) happen, and that people who are generally “good” are involved in such events, the badness of one of the parties involved simply cannot explain them. So the question arises – if both parties were flawless, why did the relationship go wrong? And this is where the correlation term comes in!

It is rather easy to explain using vector algebra. If you have two vectors A and B, the magnitude of the sum of the two vectors is given by $\sqrt{|A|^2 + |B|^2 + 2 |A||B| \cos \theta}$, where $|A|, |B|$ are the magnitudes of the two vectors respectively and $\theta$ is the angle between them. It is easy to see from the above formula that the magnitude of the sum of the vectors depends not only on the magnitudes of the individual vectors, but also on the angle between them.

To illustrate with some examples: if A and B are perfectly aligned ($\theta = 0$, $\cos \theta = 1$), then the magnitude of their vector sum is the sum of their magnitudes. If they oppose each other ($\cos \theta = -1$), then the magnitude of their vector sum is the difference of their magnitudes. And if A and B are orthogonal, then $\cos \theta = 0$, and the magnitude of their vector sum is $\sqrt{|A|^2 + |B|^2}$.

And if we move from vector algebra to statistics, then if A and B represent two datasets, the $\cos \theta$ is nothing but the correlation between A and B. And in the investing world, correlation is a fairly important and widely used concept!
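
A quick numerical illustration of this (a sketch with simulated data, nothing from the speech): the variance of the sum of two datasets behaves exactly like the vector formula above, with the covariance playing the role of the $2 |A||B| \cos \theta$ term.

```python
# Illustrative sketch: var(a + b) = var(a) + var(b) + 2*cov(a, b),
# i.e. the "2ab" term gets scaled by the correlation between a and b.
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=10_000)
noise = rng.normal(size=10_000)

for rho in (-0.9, 0.0, 0.9):
    b = rho * a + np.sqrt(1 - rho**2) * noise   # b is correlated ~rho with a
    lhs = np.var(a + b)
    rhs = np.var(a) + np.var(b) + 2 * np.cov(a, b, bias=True)[0, 1]
    print(f"rho = {rho:+.1f}: var(a+b) = {lhs:.2f} = a^2 + b^2 + 2ab term = {rhs:.2f}")
```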

So essentially, the concept that the Prime Minister alluded to in his lecture in Canada is rather important, and while it is commonly used in both science and finance, it is something people generally disregard in their daily lives. From this point of view, kudos to the Prime Minister for bringing up this concept of the correlation term! And here is my interpretation of it:

At first I was a bit upset with Modi because he only mentioned “2ab” and left out the correlation term ($\cos \theta$). Thinking about it some more, I reasoned that he left it out to imply that it was equal to 1, or that the angle between a and b in this case (i.e. India and Canada’s interests) is zero – in other words, that India and Canada’s interests are perfectly aligned! There could have been no better way of putting it!

So thanks to the Prime Minister for bringing up this rather important concept of correlation to public notice, and I hope that people start appreciating the nuances of the concept rather than brainlessly lampooning him!

Sigma and normal distributions

I’m on my way to the Bangalore airport now, north of Hebbal flyover. It’s raining like crazy again today – the second time in a week it’s raining this badly.

I instinctively thought “today is an N sigma day in terms of rain in Bangalore” (where N is a large number). Then I immediately realized that such a statement would make sense only if rainfall in Bangalore were to follow a normal distribution!

When people say something is an N sigma event, what they’re really trying to convey is that it is a very improbable event, and the N is a measure of this improbability. The relationship between N and the implied improbability is given by the shape of the normal curve.

However, when a quantity follows a distribution other than the normal, the relationship between the number of sigmas (standard deviations from the mean) and the implied probability breaks down, and N sigmas will mean something totally different in terms of the implied improbability.

It is good practice, thus, to stop talking in terms of sigmas and talk in terms of odds. It’s better to say “a one in forty event” rather than “a two sigma event” (I’m assuming a one-tailed normal distribution here).
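
To illustrate how much the sigma-to-odds mapping depends on the distribution, here is a small sketch (the Laplace distribution is just a stand-in for any fatter-tailed alternative):

```python
# One-tailed odds implied by "N sigma" under a normal distribution,
# versus a fat-tailed (Laplace) distribution with the same mean and sigma.
from math import erfc, exp, sqrt

def normal_tail(k):
    return erfc(k / sqrt(2)) / 2     # P(X > mean + k*sigma), normal

def laplace_tail(k):
    return 0.5 * exp(-k * sqrt(2))   # P(X > mean + k*sigma), Laplace

for k in (2, 3, 4):
    print(f"{k} sigma: ~1 in {1 / normal_tail(k):,.0f} (normal) "
          f"vs ~1 in {1 / laplace_tail(k):,.0f} (Laplace)")
```

A “four sigma event” is roughly a one in thirty thousand event if the quantity is normally distributed, but only about a one in six hundred event under the fatter-tailed alternative.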

The broader point is that the normal distribution is so ingrained in people’s minds that they assume all quantities follow a normal distribution – which is dangerous and needs to be discouraged strongly.

In this direction any small measure – like talking odds rather than in terms of sigma – will go a long way!

The moving solstice

Today is “Makara Sankranti”. If the name doesn’t already strike you, “Makara” is the Sanskrit name for “Capricorn”. The Makara Sankranti is supposed to represent the Winter Solstice in the Northern Hemisphere, or the day when the Sun is directly over the Tropic of Capricorn.

However, we know that the winter solstice falls on the 21st or 22nd of December every year. Then why is it that the Indian version of the Winter Solstice falls on the 15th of January?

I’m not sure if you remember, but a few years back, Makara Sankranti would usually fall on the 14th of January. After some back-and-forth movements, it has now settled on the 15th of January. You might have already noticed that this is unlike other Indian festivals such as Deepavali or Ganesh Chaturthi, whose dates according to the Gregorian calendar move every year (typically in a -11, -11, +19 cycle over three years). This is because unlike Deepavali or Ganesh Chaturthi, which are observed according to the lunar calendar, Makara Sankranti follows the solar calendar!
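
(A rough back-of-envelope for that -11, -11, +19 cycle, using approximate figures: twelve lunar months add up to about 354 days, roughly 11 days short of the solar year, so a lunar-calendar festival arrives about 11 days earlier each Gregorian year; every third year or so an intercalary month of about 30 days is inserted to keep the two calendars in sync, pushing the date forward by roughly 30 - 11 = 19 days.)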

I recently read a book called “Solstice at Panipat”, about the third battle of Panipat in 1761 (my review is here). The Marathas went to battle four days after celebrating the Winter Solstice. The battle was fought on the 14th of January 1761, which means the solstice was observed that year on the 10th of January. So you see that the solstice, which is supposed to be observed on the 21st/22nd of December, was observed on the 10th of January in 1761, and on the 15th of January in 2014.

This shows that there is an error in the Indian solar calendar. This error amounts to about 20 minutes a year, which means that at the rate at which we are going, about 10,000 years from now the Makara Sankranti (“Winter Solstice”) will fall in June, in the middle of the summer!

That we know that the error in the Hindu solar calendar is 20 minutes a year allows us to calculate the last time the calendar was calibrated – we can date it to around 285 AD. Back in 285 AD, the calendar was calculated accurately, with the Winter Solstice falling on the actual Winter Solstice. After that, the calendar has drifted, and one can say, so has Indian science.
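
(As a rough sanity check on these numbers: a 20-minute-a-year error over the roughly 1,730 years between 285 AD and now adds up to about 34,600 minutes, or roughly 24 days – almost exactly the gap between the actual solstice on the 21st/22nd of December and Makara Sankranti on the 15th of January. Over 10,000 years, the same error accumulates to about 139 days, which is what eventually pushes the festival into June.)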

I’m informed, however, that this 20 minute error in the Hindu solar calendar is deliberate, and that this has been put in place for astrological reasons. Apparently, astrology follows a 26400 year cycle, and for that to bear out accurately, our solar calendar needs to have a 20 minute per year error! So for the last 1700 or so years, we have been using a calendar that is accurate for astrological calculations but not to seasons! Thankfully, the lunar calendar, which has been calibrated to the movement of stars, captures seasons more accurately!

I’ll end this post with a twitter conversation (I’m off twitter now, btw) where I learnt about this inaccuracy:

Update: The link to the tweet doesn’t show the entire thread. See that here.

Update: Here is a piece by astrophysicist Jayant Narlikar on the Makara Sankranti. Basically due to a change in the earth’s axis, our divisions of the night sky into 12 constellations are not stationary, and hence the date when the sun moves from “Dhanur” to “Makara” is no longer the solstice date.

 

Festivals and memes

We don’t normally celebrate festivals. We don’t particularly enjoy them. The only festival we celebrate to some degree is Dasara, when we set up dolls and invite people home to view them. Of course, for the last couple of years it’s been a similar arrangement and there hasn’t been much innovation in what we do, but we enjoy it as a process and hence take the festival forward. Last year, we even got some fireworks during Deepavali and burst them. Again – it was a fun element. We aren’t too enthused by rituals, and since most other festivals are little more than rituals, we don’t celebrate them.

The wife, however, sometimes has existential doubts. “There must be a reason that our ancestors celebrated these festivals”, she pops up from time to time, “so it may not be correct on our part to simply stop celebrating. We should take forward the tradition”. This is a question that comes up each time we don’t celebrate a festival (which, you might guess, is fairly often). Before today I hadn’t been able to give a convincing reply either way – whether it makes sense to follow our instinct, or whether it’s a cultural duty to take forward the tradition.

Towards the end of his classic book The Selfish Gene, Richard Dawkins introduces the concept of the meme. In fact it was Dawkins who “invented” the concept of the meme. It is meant to be a cultural analogue of the gene – a “cultural” unit that propagates through the generations the way biological traits are propagated via genes. Given the multitude of so-called memes that keep popping up every other day, I’m sure all of you know what a meme is. I’m just providing the context here since my argument depends on the original Dawkinsian definition of the meme.

Let us say that there is a genetic attribute I inherited from my father – let’s say it’s my height (my father was 5 feet 10 inches, and I’m an inch taller than that). Now, it is not necessary that this particular gene is passed on to my progeny. It is not even necessary that the corresponding gene from my wife gets passed on – there might be a mutation there, and despite the wife and me being fairly tall (by Indian standards), we cannot rule out producing a short child. The point I’m trying to make is that while genes propagate, not every trait needs to pass on from you to your offspring. Only a few traits (chosen more or less at random when your and your gene-propagating partner’s genes undergo meiosis) get passed on. Yet, through the network of you and your siblings and cousins and extended family, the family’s genetic code gets passed on.

Now, festivals and other cultural practices can be described as memes. We in Indian society have a set of memes, which are called “Ganesh Chaturthi”, “Deepavali”, and so on. That these memes have survived through the generations shows their strength – who knows how many festivals were invented but didn’t survive. Now, the fact that we have inherited a meme doesn’t necessarily mean that we need to propagate it. Unlike in genetics, the choice here is not a random combination – it is our personal choice (we can’t decide what genes our offspring inherits from either of us or through a mutation).

So, just as every genetic trait doesn’t get propagated from a parent to an offspring, not every cultural trait needs to be passed on. If I were to pass on every cultural trait I inherited, irrespective of whether it is still desirable when circumstances change, undesirable cultural traits would continue to exist. This is not efficient. As a society, we have bandwidth only for a certain number of cultural traits, and if traits are passed on without much thought, the bad ones won’t die. And they will crowd out the good ones.

So if you were to look at it in terms of responsibility to society, you need to propagate only those cultural traits that you deem to be relevant and important. “So what if everyone stops celebrating Ganesh Chaturthi?” you may ask. If that were to happen, it would simply mean a vote of no confidence in the festival, and an indication that the festival needs to be phased out. If everyone were to propagate only those cultural traits they find useful, traits that a significant proportion of society finds valuable will continue to survive and thrive. For Ganesh Chaturthi to exist 30 years hence, it isn’t necessary for ALL families that have inherited it to celebrate it now. As long as a critical mass of families celebrate it, the festival will survive. If not, it probably doesn’t need to exist.

(the choice of Ganesh Chaturthi for illustration is purely driven by the fact that the festival is today).

Bayesian Recognition

We don’t meet often, but every time we talk, she reminds me that I had failed to recognize her the first time we met after graduating together from school. Yes, I could claim in my defence that I was seeing her for the first time in over six years. While that might be a valid excuse for most people, it doesn’t apply to me, since I normally claim to have superior long-term memory. If I’ve seen you somewhere before, I ought to recognize you. The only times I don’t are when I’m pretending, since I don’t want to embarrass you (and myself) by recognizing you while you don’t recognize me (see this incident for an example).

The reason for my failure that cold Bangalore evening in December 2006 was that my Bayesian system had failed me. Let me explain, in the process giving you an insight into my Bayesian system which I use to recognize you when I meet you.

About a month or two back, I was at a friend’s wedding, which is where I hit upon this term “Bayesian recognition” to explain this phenomenon  (which I’ve been practicing for ages). Now, this friend whose wedding I was attending was one year my junior at two different schools. As you might expect at an event where you and the host share more than one social network, there were a lot of familiar faces. Some people I knew fairly well, and could easily recognize. But the others had to go through a “Bayesian search”.

So when I saw someone who looked like one of three people I know – let’s say X, Y and Z – in order to determine which of them this person was, I would ask myself two questions: firstly, what were the odds, going by resemblance, that the person I saw was each of X, Y or Z; and secondly, what were the prior odds of each of X, Y and Z being there at that event. Note that the latter is important. For example, if someone at the event looks like you, and I know (for example) that you are currently in another country, despite the strong resemblance I can discount the possibility that that person is you, and go ahead with my search.

Note that this differs from “frequentist recognition”, where I only look at the person’s face and try and understand who he/she most resembles, without any thought to the odds that that person is there. Frequentist recognition can lead to a large number of false positives, and after a few rounds of embarrassment, you start giving up on recognizing, and many a possible reunion thus gets missed. Bayesian recognition, on the other hand, restricts your field of search (to the people who you give good odds of being there), prevents you from being distracted and increases your chances of making a good recognition.
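
Here is a small sketch of that idea (entirely made-up numbers, purely for illustration): frequentist recognition would pick the best facial match, while Bayesian recognition weights each match by the odds of that person being at the event.

```python
# Illustrative sketch of "Bayesian recognition" over three candidates.
def bayesian_recognition(resemblance, prior_presence):
    """Posterior over candidates = resemblance likelihood x prior odds of being present."""
    scores = {name: resemblance[name] * prior_presence[name] for name in resemblance}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}

resemblance    = {"X": 0.7, "Y": 0.6, "Z": 0.5}    # how well the face matches each candidate
prior_presence = {"X": 0.05, "Y": 0.6, "Z": 0.3}   # odds of each candidate being at the event

# Frequentist recognition picks X (best resemblance); Bayesian recognition
# picks Y once the odds of actually being present are factored in.
print(bayesian_recognition(resemblance, prior_presence))
```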

So why did Bayesian recognition fail me when I met this former classmate back in 2006? The problem was her company. She had come for this Deep Purple concert with another friend of mine, who was my classmate in another school (and who I had been in touch with, and so easily recognized). I had no clue that these two were friends (it turned out they didn’t know each other that well – they had come there with a common friend). So when this girl (the one I didn’t recognize) popped up with “Hey SK! Do you remember me?” I assumed that she was someone I knew from the same school as the other girl I was meeting, and that wrongly restricted my search space. And so my mind was trying to map her to my friends from school 1, while she happened to be a friend from school 2. And my search returned a blank, and my legendary long-term memory skills were embarrassed.

I must mention here, though, that this is possibly the only time that my Bayesian recognition model has acted up and refused to recognize someone I know. There have been 2-3 false positives, but this has been the only false negative. And when you consider the sample size to be all the people I have recognized in different places, that is small indeed.

Oh, and after failing to recognize her then, I’ve kept in touch with this friend.