Structures of professions and returns to experience

I’ve written here a few times about the concept of “returns to experience”. Basically, in some fields such as finance, the returns to experience are rather high. Irrespective of what you have studied or where, how long you have continuously been in the industry and what you have been doing have a bigger impact on your performance than your way of thinking or education.

In other domains, returns to experience are far lower. After a few years in the profession, you would have learnt all you had to, and working longer in the job will not necessarily make you better at it. And so you see that the average person with 15 years of experience is not that much better than the average person with 10 years, and salaries stagnate as careers progress.

While I have spoken about returns to experience, till date, I hadn’t bothered to figure out why returns to experience is a thing in some, and only some, professions. And then I came across this tweetstorm that seeks to explain it.

Now, normally I have a policy of not reading tweetstorms longer than six tweets, but here it was well worth it.

It draws upon a concept called “cognitive flexibility theory”.

Basically, there are two kinds of professions – well-structured and ill-structured. To quickly summarise the tweetstorm, well-structured professions have the same problems again and again, and there are clear patterns. In these professions, first principles are good enough to reason out most things and solve most problems. And so the way you learn them is by learning concepts and theories and solving a few problems.

In ill-structured domains (e.g. business or medicine), the concepts are largely the same but the ways the concepts manifest in different cases are vastly different. As a consequence, just knowing the theories or fundamentals is not sufficient to understand most cases, each of which is idiosyncratic.

Instead, learning in these professions comes from “studying cases”. Business and medical schools are classic examples of this. The idea with solving lots of cases is NOT that you can see the same patterns in a new case that you see, but that having seen lots of cases, you might be able to reason HOW to approach a new case that comes your way (and the way you approach it is very likely novel).

Picking up from the tweetstorm once again:

 

It is not hard to see that when the problems are ill-structured or “wicked”, the more cases you have seen in your life, the better placed you are to attack the problem. Naturally, assuming you continue to learn from each incremental case you see, the returns to experience in such professions are high.

In securities trading, for example, the market takes very many forms, and irrespective of what chartists will tell you, patterns seldom repeat. The concepts are the same, however. Hence, you treat each new trade as a “case” and try to learn from it. So returns to experience are high. And so when I tried to reenter the industry after 5 years away, I found it incredibly hard.

Chess, on the other hand, is well-structured. Yes, AlphaZero might come and go, but a lot of the general principles simply remain.

Having read this tweetstorm, gobbled a large glass of wine and written this blogpost (so far), I’ve been thinking about my own profession – data science. My sense is that data science is an ill-structured profession where most practitioners pretend it is well-structured. And this is possibly because a significant proportion of practitioners come from academia.

I keep telling people about my first brush with what can now be called data science – I was asked to build a model to forecast demand for air cargo (2006-7). The said demand being both intermittent (one order every few days for a particular flight) and lumpy (a single order could fill up a flight, for example), it was an incredibly wicked problem.

Having had a rather unique career path in this “industry” I have, over the years, been exposed to a large number of unique “cases”. In 2012, I’d set about trying to identify patterns so that I could “productise” some of my work, but the ill-structured nature of problems I was taking up meant this simply wasn’t forthcoming. And I realise (after having read the above-linked tweetstorm) that I continue to learn from cases, and that I’m a much better data scientist than I was a year back, and much much better than I was two years back.

On the other hand, because data science attracts a lot of people from pure science and engineering (classically well-structured fields), you see a lot of people trying to apply overly academic or textbook approaches to problems that they see. As they try to divine problem patterns that don’t really exist, they fail to recognise novel “cases”. And so they don’t really learn from their experience.

Maybe this is why I keep saying that “in data science, years of experience and competence are not correlated”. However, fundamentally, that ought NOT to be the case.

This is also perhaps why a lot of data scientists, irrespective of their years of experience, continue to remain “junior” in their thinking.

PS: The last few paragraphs apply equally well to quantitative finance and economics. They are ill-structured professions that some practitioners (thanks to well-structured backgrounds) assume are well-structured.

Resorts

We spent the last three days at a resort, here in Karnataka. The first day went off very peacefully. On the second day, a rather loud group checked in. However, our meal times generally didn’t intersect with theirs and they weren’t too much of a bother.

Yesterday, a bigger and louder (and rather obnoxious – they were generally extremely rude to the resort staff) group checked in. Unfortunately their meal times overlapped with ours, and their unpleasantness had a bearing on us. Our holiday would have been far better had this group not checked in to our resort, but there was no way we could have anticipated, or controlled for that.

The moral of the story, basically, is that your experience at a resort is highly dependent on who else is checked in to the resort at the same time.

The thing with resorts is that unlike “regular hotels”, you end up spending all your time during your holiday in the resort itself, so the likelihood of bumping into or otherwise encountering others who are staying at the resort is far higher. And this means that if you don’t want to interact with some of the people there, you sometimes don’t really have a choice.

Of course, it helped that the resort we were in had private swimming pools attached to each room, and was rather large. So the only times we encountered the other groups at the resort were at meal times. However, as we found during our last day there, that itself was enough to make the experience somewhat unpleasant.

My wife and I had a long conversation last night on what we could do to mitigate this risk. We wondered if the resorts we have been going to are “not premium enough” (then again, a resort with private swimming pools in each room can be considered to be as premium as it gets). However, we quickly realised that ability to pay for a holiday is not at all correlated with pleasantness.

We wondered if resorts that are out of the way or in otherwise not so popular places are a better hedge against this. Now, with smaller or less popular resorts, the risk of having unpleasant co-guests is smaller (since the number of co-guests is lower). However, if one or more of the co-guests happens to be unpleasant, it will impact you a lot more. And that’s a bit of a risk.

Maybe the problem is with India, we thought, since one of the nice resort holidays we’ve had in the last couple of years was in the Maldives. Then again, we quickly remembered the time at Taj Bentota (on our honeymoon) where the swimming pool had been taken over by a rather loud tour group, driving us nuts (and driving us away to the beach).

We thought of weekday vs weekend. Peak season vs off season. School holidays vs exam season. We were unable to draw any meaningful correlations.

There is no solution, it seemed. Then we spent time analysing why we didn’t get bugged by fellow-guests in the Maldives (my wife helpfully remembered that the family at the table next to ours at one of the dinners was rather loud and obnoxious). It had to do with size. It was a massive resort. Because the resort was so massive, there were bound to be other guests who were obnoxious. However, given the size of the resort, they would “become white noise”.

So, for now, we’ve taken a policy decision that for our future travel in India, we’ll either go to really large resorts, or we’ll do a “tourist tour” (seeing places, basically) while staying at “business hotels”. This also means that we’re unlikely to do another multi-day holiday until Covid-19 is well under control.

Postscript: Having spent a considerable amount of time in the swimming pool attached to our room, I now have a good idea of why public swimming pools haven’t yet been opened up post Covid-19. Basically, I found myself blowing my nose and spitting into the pool a fair bit during the time when I was there. Since the only others using it at that time were my immediate family, it didn’t matter, but this tells you why public swimming pools may not be particularly safe.

Postscript 2: One other problem we have with Indian resorts is the late dinner. At home, we adults eat at 6pm (and our daughter before that). Pretty much every resort we’ve stayed in over the last year and a half has started serving dinner only by 8pm, or sometimes at 9pm. And this has sort of messed with our “systems”.

Big Data and Fast Frugal Trees

In his excellent podcast episode with EconTalk’s Russ Roberts, psychologist Gerd Gigerenzer introduces the concept of “fast and frugal trees”. When someone needs to make decisions quickly, Gigerenzer says, they don’t take into account a large number of factors, but instead rely on a small set of thumb rules.

The podcast itself is based on Gigerenzer’s 2007 book Gut Feelings. Based on how awesome the podcast was, I read the book, but found that it didn’t offer too much more than what the podcast itself had to offer.

Coming back to fast and frugal trees...

In recent times, ever since “big data” became a “thing” in the early 2010s, it is popular for companies to tout the complexity of their decision algorithms, and machine learning systems. An easy way for companies to display this complexity is to talk about the number of variables they take into account while making a decision.

For example, you can have “fin-tech” lenders who claim to use “thousands of data points” on their prospective customers’ histories to determine whether to give out a loan. A similar number of data points is used to evaluate resumes and determine if a candidate should be called for an interview.

With cheap data storage and compute power, it has become rather fashionable to “use all the data available” and build complex machine learning models (which aren’t that complex to build) for decisions that were earlier made by humans. The problem is that this can sometimes result in overfitting (the system learning something that it shouldn’t be learning), which can lead to disastrous predictive power.

In his podcast, Gigerenzer talks about fast and frugal trees, and says that humans in general don’t use too many data points to make their decisions. Instead, for each decision, they build a quick “fast and frugal tree” and make their decision based on their gut feelings about a small number of data points. What data points to use is determined primarily based on their experience (not cow-like experience), and can vary by person and situation.

The advantage of fast and frugal trees is that the model is simple, and so has little scope for overfitting. Moreover, as the name suggests, the decision process is rather “fast”, and you don’t have to collect all possible data points before you make a decision. The problem with productionising the fast and frugal tree, however, is that each user’s decision-making process is different, and the challenge is how to learn that process so that we can make optimal decisions at a personalised level.
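To make the idea concrete, here is a minimal sketch in Python of what a fast and frugal tree might look like. Everything in it is my own illustration and not something from Gigerenzer's podcast or book: the `Cue` class, the `fast_frugal_decide` function, and the three lending cues with their thresholds are all hypothetical. The only thing it is meant to show is the structure: cues are checked one at a time, in a fixed order, and the process stops at the first cue that triggers an exit.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Sequence, Tuple

Case = Mapping[str, float]


@dataclass(frozen=True)
class Cue:
    name: str                     # human-readable description of the cue
    test: Callable[[Case], bool]  # yes/no question asked about the case
    exit_if: bool                 # the answer that triggers an immediate exit
    decision: str                 # decision taken when the exit is triggered


def fast_frugal_decide(cues: Sequence[Cue], case: Case, default: str) -> Tuple[str, int]:
    """Check cues in order and stop at the first exit.

    Returns the decision and the number of cues that had to be checked.
    """
    for checked, cue in enumerate(cues, start=1):
        if cue.test(case) == cue.exit_if:
            return cue.decision, checked
    # No cue triggered an exit, so fall back to the default decision.
    return default, len(cues)


# Hypothetical three-cue tree for a lender (cues and thresholds are made up).
LOAN_TREE = [
    Cue("missed a payment recently", lambda c: c["missed_payments"] > 0, True, "reject"),
    Cue("income is 3x the instalment", lambda c: c["income"] >= 3 * c["instalment"], True, "approve"),
    Cue("two or more years in current job", lambda c: c["years_in_job"] >= 2, False, "reject"),
]

applicant = {"missed_payments": 0, "income": 90, "instalment": 20, "years_in_job": 1}
print(fast_frugal_decide(LOAN_TREE, applicant, default="approve"))  # ('approve', 2)
```

Note that the third cue was never looked at for this applicant. An applicant with a missed payment would be rejected after a single cue, without the lender needing to know their income or job history at all.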

How you can learn someone’s decision-making process (when you’ve assumed it’s a fast and frugal tree) is not trivial, but if you can figure it out, then you can build significantly superior recommender systems.

If you’re Netflix, for example, you might figure that someone makes their movie choices based only on age of movie and its IMDB score. So their screen is customised to show just these two parameters. Someone else might be making their decisions based on who the lead actors are, and they need to be shown that information along with the recommendations.

Another book I read recently was Todd Rose’s The End of Average. The book makes the powerful point that nobody really is average, especially when you’re looking at a large number of dimensions, so designing for average means you’re designing for nobody.

I imagine that one reason why a lot of recommender systems (Netflix or Amazon or Tinder) fail is that they model for the average, building one massive machine learning system, rather than learning each person’s fast and frugal tree.

The latter isn’t easy, but if it can be done, it can result in a significantly superior user experience!

Experience and Cows

A lot of people make a big deal about experience. If some people (and some companies) are to be believed, the number of years in a job should be the only criterion of what someone needs to be paid and whether they deserve to be promoted.

However, not all experience is created equal. Experience matters when you are learning on the job, and where you learn the patterns that are inherent in your job, and you can over time replace your “slow thinking” about the job with more “fast thinking”.

If you continue to do the same thing in the same way throughout the years of experience, not bothering to figure out why things are done certain ways, and how things can be done better, the experience isn’t of that much use.

I leave it to former Tottenham Hotspur manager Mauricio Pochettino to explain this concept with a beautiful and profound analogy (there’s a video in this link which I’m somehow unable to embed here).

It is like a cow that, every day in 10 years, sees the train cross in front at the same time.

If you ask the cow, ‘what time is the train going to come’, it is not going to know the right answer.

In football, it is the same. Experience, yes, but hunger, motivation, circumstance, everything is so important.

It is unfortunate that the journalist who covered this story for Sky Sports thought this analogy was bizarre. Maybe he has been doing his job reporting on press conferences in the same way a cow sees a train passing by at a particular time every day?