Bayesian recognition in baby similarity

When people come to see small babies, it’s almost like they’re obliged to offer their opinions on who the child looks like. Most of the time it’s an immediate ancestor, either a parent or grandparent. Sometimes it could be a cousin or aunt or uncle as well. Thankfully it’s uncommon to compare babies’ looks to those of people they don’t share genes with.

So as people have come up and offered their opinions on who our daughter looks like (I’m top seed, I must mention), I’ve been trying to analyse how they come up with their predictions. And as I observe the connections between people making the observations, and who they mention, I realise that this too follows some kind of Bayesian Recognition.

Basically, different people who come to see the baby have different amounts of information on what each of the baby’s ancestors looked like. A recent friend of mine, for example, will only know how my wife and I look. An older friend might have some idea of how my parents looked. A relative might have a better judgment of how one of my parents looked than how I looked.

So based on their experiences in recognising different people in and around the baby’s immediate ancestry, they effectively start with a prior distribution of who the baby looks like. And then when they see the baby, they update their priors, and then mention the person with the highest posterior probability of matching the baby’s face and features.

Given that posterior probability is a function of prior probability, it is no surprise that different people disagree on who the baby looks like. After all, each of them has different private knowledge of the faces in the baby’s ancestry, and thus a different prior!
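To make the update concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the names, the numbers, and the simplifying assumption that every visitor sees the same evidence in the baby’s face (the likelihood) and differs only in the prior they walk in with. The posterior is just prior times likelihood, normalised, and each visitor names whoever has the highest posterior.

```python
def posterior(prior, likelihood):
    """Bayes' rule over a small set of candidate lookalikes.

    prior: how strongly this visitor has each face in mind (missing = 0).
    likelihood: how well the baby's features match each candidate.
    """
    unnormalised = {
        person: prior.get(person, 0.0) * likelihood[person]
        for person in likelihood
    }
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("visitor knows none of the candidates")
    return {person: value / total for person, value in unnormalised.items()}


def best_match(prior, likelihood):
    """The person the visitor names: the highest posterior probability."""
    post = posterior(prior, likelihood)
    return max(post, key=post.get)


# Hypothetical evidence from the baby's face, assumed the same for everyone.
likelihood = {"father": 0.30, "mother": 0.25, "paternal_grandmother": 0.45}

# A recent friend has only ever seen the parents.
recent_friend = {"father": 0.5, "mother": 0.5}

# An older relative knows the grandmother's face best.
older_relative = {"father": 0.2, "mother": 0.1, "paternal_grandmother": 0.7}

print(best_match(recent_friend, likelihood))
print(best_match(older_relative, likelihood))
```

Same baby, same evidence, yet the recent friend names the father while the older relative names the grandmother, purely because their priors differ.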

Unrelated, but staying on Bayesian reasoning, I recently read this fairly stud piece in Aeon on why stereotyping is not necessarily a bad thing. The article argues that in the absence of further information, stereotypes help us form a good first prior, and that stereotypes only become a problem if we fail to update our priors with any additional information we get.

Priors and posteriors

There is a fundamental difference between version 1.0 of anything and any subsequent version. In version 1.0, you usually don’t need to give any reasons for your choices. The focus in that case is on getting the version ready, and you can get away with whatever assumptions you feel like making. Nobody will question you because first of all they want to see your product out, and not delay it with “class participation”. The prior thus gets established.

Now, for any subsequent version, if you suggest a change, it will be evaluated against what is already there. You need to do a detailed scientific analysis of the switching costs and switching benefits, and make a compelling enough case that the change should be made. Even when it is a trivial change, you can expect it to come under a lot of scrutiny, since now there is a “prior”, a “default” which people can fall back on if they don’t like what you suggest.

People and products are resistant to change. Inertia exists. So if you want to make a mark, make sure you’re there at version 1.0. Else you’ll get caught in infinitely painful bureaucratic hassles. And given the role of version 1.0 in how a product pans out (in the sense that most of the assumptions made there never really get challenged), I think the successful products are those that got something right initially, that made better assumptions than the others.