Chat GPT and Information Content

For the last week or so I’ve been fooling around with Chat GPT. It’s been a lot of fun, if you can endure the verbosity of its answers. Yesterday, at work, I used it for tech support when some people I was talking to were using too much jargon.

Before I came home from work yesterday, my last conversation was with a couple of recruiters, about a role that I’m hiring for. So when I got home, on a whim, I wondered if Chat GPT could write job descriptions for me.

I asked:

write a job description for a senior data scientist at India’s largest logistics company. 4-7 years exp. Needs to have experience in ML, OR and statistics, and be able to work close to business and on ambiguous problems

It answered (pasting in full since the screenshot above is limited):

Job Title: Senior Data Scientist

Company: India’s Largest Logistics Company

Location: India

Experience: 4-7 years

We are seeking an experienced and highly skilled Senior Data Scientist to join our team at India’s largest logistics company. In this role, you will be responsible for using machine learning, operations research, and statistics to drive business insights and decision making.

As a Senior Data Scientist, you will have a deep understanding of data science techniques and technologies, and will be able to apply them to complex and ambiguous problems. You will be able to work closely with business teams to understand their needs and to develop solutions that drive value for the company.

To be successful in this role, you should have at least 4-7 years of experience in data science, and should have a strong background in machine learning, operations research, and statistics. You should be able to work independently and be comfortable working on complex and ambiguous problems.

Responsibilities:

  • Use machine learning, operations research, and statistics to drive business insights and decision making
  • Work closely with business teams to understand their needs and develop solutions
  • Develop and maintain data pipelines and systems
  • Collaborate with other data scientists and cross-functional teams to develop and implement solutions

Qualifications:

  • Bachelor’s or Master’s degree in a quantitative field, such as computer science, statistics, or engineering
  • At least 4-7 years of experience in data science
  • Strong background in machine learning, operations research, and statistics
  • Ability to work independently and on complex and ambiguous problems
  • Excellent communication and collaboration skills
  • Experience with Python, R, or other programming languages is a plus

We offer a competitive salary and benefits package, as well as opportunities for growth and advancement. If you are an experienced data scientist with a passion for driving business insights and decision making, we encourage you to apply.

In the past when I’ve been hiring, I’ve been a bit annoyed by people asking for job descriptions. I’ve largely found them to be fluff and boilerplate, and I’ve tried to get away with a sentence or two. However, due to the insistence of recruiters and candidates alike, I’ve ended up writing elaborate job descriptions. I don’t need to any more.

This post, however, is not about hiring (but yeah – I am hiring, and the above job description is reasonably accurate). It’s about information content in general.

Traditionally, information content has been measured in bits. A bit is the amount of information contained in an equi-probable binary event.

Sometimes when we find that someone is too verbose and using too many words when fewer would suffice, we say that their bit rate is low. We also use “low bit rate” to describe people such as former Prime Minister Atal Behari Vajpayee, who would speak incredibly slowly.

However, beyond the bit, which is a fairly technical concept, it has been difficult to quantify information content. Sometimes you read an article or a story and find that there is nothing much to it. But given the natural language, and the context of various words, it is impossible to quantify the information content.

Now, with Chat GPT, maybe it becomes a bit easier (though one would need a “reverse chat GPT algo”, to find the set of prompts required for Chat GPT to churn out a particular essay). Above, for example, I’ve shown how much fluff there generally is to the average job description – a fairly short prompt generated this longish description that is fairly accurate.

So you can define the information content of a piece or essay in terms of the number of words in the minimum set of prompts required for Chat GPT (or something like it) to come up with it. If you are a boring stereotypical writer, the set of prompts required will be smaller. If you are highly idiosyncratic, then you will need to give a larger number of prompts for Chat GPT to write like you. You know where I’m going.
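Here is a minimal sketch of that measure in Python. It assumes you already have some prompt that reproduces the essay, so what it gives you is only an upper bound on the true "minimum prompt" number (finding the real minimum would need that reverse algo). The function names and the toy strings are made up for illustration.

```python
def word_count(text):
    """Number of whitespace-separated words in a piece of text."""
    return len(text.split())


def prompt_to_output_ratio(prompt, output):
    """Words of prompt used per word of generated output.

    Lower values suggest more fluff: a few words of prompt were enough
    to produce many words of text. The given prompt may not be the
    shortest one that works, so this is only an upper bound.
    """
    return word_count(prompt) / word_count(output)


# Hypothetical toy strings, not the real prompt and job description.
toy_prompt = "write a job description"
toy_output = "We are seeking an experienced and highly skilled person to join our team"

ratio = prompt_to_output_ratio(toy_prompt, toy_output)
print(f"{word_count(toy_prompt)} prompt words for {word_count(toy_output)} output words")
```

To try it on a real case, pass the prompt and the full generated text as the two arguments. An idiosyncratic writer should score higher than a boilerplate job description.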

This evening, in office, a colleague commented that now it will be rather easy to generate marketing material. “Even blogs might become dead, since with a few prompts you can get that content”, he said (it can be a legit service to build off the Chat GPT API to take a tweet and convert it into an essay).

I didn’t tell him then but I have decided to take it up as a challenge. I consider myself to be a fairly idiosyncratic writer, which means I THINK there is a fair bit of information content in what I write, and so this blog will stay relevant. Let’s see how it goes.

PS: I still want to train a GAN on my blog (well over a million words, at last count) and see how it goes. If you know of any tools I can use for this, let me know!

 

Super Deluxe

In my four years in Madras (2000-4), I learnt just about enough Tamil to watch a Tamil movie with subtitles. Without subtitles is still a bit of a stretch for me, but the fact that streaming sites offer all movies with subtitles means I can watch Tamil movies now.

At the end, I didn’t like Super Deluxe. I thought it was an incredibly weird movie. The last half an hour was beyond bizarre. Rather, the entire movie is weird (which is good in a way we’ll come to in a bit), but there is a point where there is a step-change in the weirdness.

The wife had watched the movie some 2-3 weeks back, and I was watching it on Friday night. Around the time when she finished the movie she was watching and was going to bed, she peered into my laptop and said “it’s going to get super weird now”. “As if it isn’t weird enough already”, I replied. In hindsight, she was right. She had peered into my laptop right at the moment when the weirdness goes to yet another level.

It’s not often that I watch movies, since most movies simply fail to hold my attention. The problem is that most plots are rather predictable, and it is rather easy to second-guess what happens in each scene. It comes down to the information theoretic concept of “surprise”: a scene you saw coming carries very little of it.

Surprise is maximised when the least probable thing happens at every point in time. And when the least probable thing doesn’t happen, there isn’t a story, so filmmakers overindex on surprises, making sure the less probable thing happens. So if you indulge in a small bit of second order thinking, the surprises aren’t surprising any more, and the movie becomes boring.
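A small sketch of why second order thinking kills the surprise. In information theory the surprise of an event with probability p is -log2(p) bits. The probabilities below are hypothetical numbers picked only for illustration, not measurements of anything.

```python
import math


def surprisal_bits(probability):
    """Surprise, in bits, of seeing an event that had this probability."""
    return -math.log2(probability)


# Hypothetical beliefs about the same plot twist.
naive_belief = 0.1  # a naive viewer thinks the twist is unlikely
savvy_belief = 0.7  # a second order viewer expects the filmmaker to pick the twist

naive_surprise = surprisal_bits(naive_belief)
savvy_surprise = surprisal_bits(savvy_belief)

print(f"naive viewer: {naive_surprise:.2f} bits")
print(f"savvy viewer: {savvy_surprise:.2f} bits")
```

With these made-up numbers the same twist is worth roughly 3.3 bits to the naive viewer and about half a bit to the viewer who has learnt to expect twists.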

Super Deluxe establishes pretty early on that the plot is going to be rather weird. And when you think the scene has been set with sufficient weirdness in each story (there are four intertwined stories in the movie, as per modern fashion), the next time the movie comes back to the story, the story is shown to get weirder. And so you begin to expect weirdness. And this, in a way, makes the movie less predictable.

The reason a weird movie is less predictable is that at each scene it is simply impossible for the viewer to even think of the possibilities. And in a movie that gets progressively weirder like this one, every time you think you have listed out the possibilities and predicted what happens, what follows is something from outside your “consideration set”. And that keeps you engaged, and wanting to see what happens.

The problem with a progressively weird movie is that at some point it needs to end. And it needs to end in a coherent way. Well, it is possible sometimes to leave the viewer hanging, but some filmmakers see the need to provide a coherent ending.

And so what usually happens is that at some point in time the plot gets so remarkably simplified that everything suddenly falls in place (though nowhere as beautifully as things fall in place at the end of a Wodehouse novel). Another thing that can happen is that weirdness is taken up a notch, so that things fall in place at a “meta level”, at which point the movie can end.

The thing with Super Deluxe is that both these things happen! On one side the weirdness is taken up several notches. And on the other the plots get so oversimplified that things just fall in place. And that makes you finish the movie with a rather bitter taste in the mouth, feeling thoroughly unsatisfied.

That the “ending” of the movie (where things get really weird AND really simplified) lasts half an hour doesn’t help matters.

The high cost of “relaxing” activities

So I have a problem. I can’t seem to enjoy movies any more. I’ve written about this before. My basic problem is that I end up second-guessing the plots of most movies that I watch (how many storylines are there anyways? According to Kurt Vonnegut, there are six story arcs).

So as I watch movies, I know exactly what is going to happen. And just continuing to watch the movie waiting for that to happen is simply a waste of time – it adds no information content to me.

The result is that I’m extremely selective about the kinds of movies I watch. Some genres, such as Westerns, work because even if the stories may be predictable, the execution and the manner of execution are not, and that makes for interesting watching.

Then, of course, there are directors who have built up a reputation of being “offbeat”, where you can expect that their movies don’t follow expected story arcs – their movies have enough information content to make them worth watching.

And most “classic” movies (take any of the IMDB Top 250, for example) have stories that are told in an extremely compelling fashion – sometimes you might know what happens, but the way things are built up means that you don’t want to miss watching it happen.

Now, all this is fine, and something I’ve written about before. The point of this post is that while I feel this way about movies, my wife doesn’t feel the same way. She watches pretty much anything, even if the stories are utterly predictable.

For example, she’s watched at least 100 Telugu movies (though, admittedly, during a particularly jobless stretch in her MBA when she was watching loads of movies, even she got bored of the predictability of Telugu movies and switched to Tamil instead!). She likes to watch endless reruns of 90s Kannada movies that now appear rather lame (to me). She especially loves chick flicks, which I think have excess redundancy built into them for a very specific reason.

I don’t have a problem with any of this! In fact, I’m damn happy that she has a single-player hobby that enables her to keep herself busy when she’s bored. The only little problem I have is that she believes it is romantic to watch movies together. She might sell video for Amazon for a living, but she surely is a fan of “netflix and chill” (more the literal meaning than the euphemistic one).

And that is a problem for me, since I find the vast majority of movies boring and predictable, and she thinks the kind of movies I like are “too serious” and “not suitable for watching together” – an assessment I don’t disagree with (though I did make her watch For a Few Dollars More with me a couple of months back).

When we’re together and not talking, I’d prefer that we do other activities – reading, for example (reading offers significantly higher throughput than movies, and that, I think, is a result of formats of several lengths being prevalent – newspaper articles, longform articles, books, etc.). I’ve offered to watch movies with her on the condition that I read something at the same time – an offer that has been soundly rejected (and I understand her reasons for that).

And so we reach a deadlock, and it repeats every time we have time and want to chill. She wants to watch movies together. I initially agree, and then back out when presented with a choice of movies to watch. Sometimes I put myself through it, thoroughly not enjoying the process. Other times, much to her disappointment, we end up not watching.

Clearly there are no winners in this game!

 

 

The one bit machine

My daughter is two weeks old today and she continues to be a “one bit machine”. The extent of her outward communication is restricted to a maximum of one bit of information. There are basically two states her outward communication can fall under – “cry” and “not cry”, and given that the two are not equally probable, the amount of information she gives out is strictly less than one bit.
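To see why unequal probabilities mean less than one bit, here is a quick sketch of Shannon entropy in Python. The cry probabilities are hypothetical, since I haven't measured them.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


# Two equally likely states ("cry" and "not cry") give exactly one bit.
equiprobable = entropy_bits([0.5, 0.5])

# Hypothetical split: she cries 20% of the time, so less than one bit.
skewed = entropy_bits([0.2, 0.8])

# Hypothetical future: three equally likely signals (hungry cry,
# diaper cry, not crying) would give log2(3) bits.
three_signals = entropy_bits([1 / 3, 1 / 3, 1 / 3])

print(f"equiprobable: {equiprobable:.2f} bits")
print(f"skewed:       {skewed:.2f} bits")
print(f"three states: {three_signals:.2f} bits")
```

The only way to get past one bit is to add more distinguishable states, which is exactly what a second kind of cry would do.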

I had planned to write this post two weeks back, the day she was born, and wanted to speculate how long it would take for her to expand her repertoire of communication and provide us with more information on what she wants. Two weeks in, I hereby report that the complexity of communication hasn’t improved.

Soon (I don’t know how soon) I expect her to start providing us more information – maybe there will be one kind of cry when she’s hungry, and another when she wants her diaper changed. Maybe she’ll start displaying other methods of outward communication – using her facial muscles, for example (right now, while she contorts her face in a zillion ways, there is absolutely no information conveyed), and we can figure out with greater certainty what she wants to convey.

I’m thinking about drawing a graph with age of the person on the X axis, and the complexity of outward information on the Y axis. It starts off with X = 0 and Y = 1 (I haven’t bothered measuring the frequency of cry/no-cry responses so let’s assume it’s equiprobable and she conveys one bit). It goes on to X = 14 days and Y = 1 (today’s state). And then increases with time (I’m hoping).

While I’m sure research exists some place on the information content per syllable in adult communication, I hope to draw this graph sometime based on personal observation of my specimen (though that would limit it to one data point).

Right now, though, I speculate what kind of shape this graph might take. Considering it has so far failed to take off at all, I hope that it’ll be either an exponential (short-term good but long-term I don’t know) or a sigmoid (more likely I’d think).

Let’s wait and see.

Information gain from relationship attempts

Every failed relationship (or attempt at a relationship) has plenty to teach you – in terms of things you got right, or wrong. Things that would make you cringe later on, and others that would make you wonder why the relationship failed. Each failed relationship (or attempt) helps you recalibrate yourself as a person – in terms of what kind of people to go after, and what kind of strategies to adopt during the process. Thus, a relationship is important not only from the direct joy it provides you, but also in terms of learnings for future relationships.

The standard model about “finding your level” in terms of determining your expectations from a potential partner involves trial and error. You “sample” by hitting on someone who you think might be a good fit. If it goes well, story ends. Else, you “learn” from this experience and hit on someone else.

How good a learner you are determines how many attempts you’ll take to find someone “your level” who is a “good fit” and end up in a great relationship. Yet, the kind of attempts you make puts a natural cap on the amount of information you extract from the attempt.

For example, there might be a potential counterparty with whom you have an extremely low (close to nothing) chance of getting into a relationship. Conventional wisdom says that you shouldn’t attempt hitting on her (to avoid pronoun confusion, let’s assume that everyone you can hit on is feminine. Adjust accordingly if your preferences vary), for the odds are stacked against you.

While this is good enough reason not to attempt that relationship (though sometimes the downside might be low enough for you to take a punt), the other problem is that you don’t learn anything from it. The extremely low prior probability of succeeding would mean that there is no information from this that can help tune your system. So you’re wasting your time in more than one way.

It works the other way also. Let’s say there’s someone who really looks up to you and wants to be in a relationship with you. You know that all it takes for you to get into a relationship with her is to express interest. If you know the relationship will add value to you, go ahead. However, it is absolutely useless in terms of “finding your level” – the extremely high prior probability means it won’t add sufficient value to the process.
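Both cases can be put in rough numbers. The sketch below is a toy model with assumptions that are entirely mine: "level" is a number from 0 to 10, you start out with no idea which of those is yours, and the chance that someone accepts you follows a logistic curve in the gap between your level and hers. The expected information gain from one attempt is then the mutual information between the outcome and your level.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a yes/no outcome that is 'yes' with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def acceptance_probability(your_level, her_level):
    """Hypothetical logistic model: the further above her you are, the likelier a yes."""
    return 1 / (1 + math.exp(-(your_level - her_level)))


def expected_information_gain(her_level, candidate_levels):
    """Bits you expect to learn about your own level from one attempt.

    Assumes a uniform prior over candidate_levels. Computed as the
    entropy of the overall outcome minus the average entropy of the
    outcome given each possible level (i.e. mutual information).
    """
    probs = [acceptance_probability(level, her_level) for level in candidate_levels]
    overall = sum(probs) / len(probs)
    average_conditional = sum(binary_entropy(p) for p in probs) / len(probs)
    return binary_entropy(overall) - average_conditional


levels = list(range(11))  # hypothetical scale: your level is somewhere in 0..10

gain_out_of_league = expected_information_gain(20, levels)  # near-certain rejection
gain_sure_thing = expected_information_gain(-10, levels)    # near-certain acceptance
gain_close_match = expected_information_gain(5, levels)     # genuinely uncertain

print(f"out of your league: {gain_out_of_league:.4f} bits")
print(f"sure thing:         {gain_sure_thing:.4f} bits")
print(f"close match:        {gain_close_match:.4f} bits")
```

In this toy setup the two one-sided attempts teach you next to nothing, while the attempt whose outcome is genuinely uncertain is worth a good fraction of a bit.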

So while they say that someone who’s been through failed relationships (or attempts at relationships) is experienced and has a more refined set of expectations, the sheer number matters less than the quality. It is the amount of information you’ve been able to extract from each such relationship (or attempt). A one-sided (where one of you is clearly “out of the league” of the other, doesn’t matter who is who) relationship doesn’t add much value.

Happy Valentine’s Day!