Channel Coding Theorem in Real Life

One of my favourite concepts in Computer Science is Shannon’s Channel Coding Theorem. This theorem is basically about the efficiency of communication over a noisy channel. And as I was thinking a few minutes back, this has interesting implications in real life as well, well away from the theory of communication.

I don’t have that much understanding of the rigorous explanation of the theorem. However, I absolutely love the central idea of it – that the noisier a channel is, the more redundancy you need in your communication, and thus the slower your communication is. A corollary of this is that every channel has a “natural maximum speed”, and as long as you try to communicate within that speed, you can communicate reliably.

I won’t go into the technical details here – that involves assuming that the channel loses (or garbles) X% of bits, and then constructing a redundant code that shows that even with this loss, you can communicate effectively.
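For the curious, here is a minimal sketch of that idea in Python. It assumes a binary symmetric channel, where each bit is flipped independently with probability p (a modelling choice of mine, not something the theorem requires), and it uses the simplest possible redundant code: send every bit n times and take a majority vote. The function names are made up for illustration. The repetition code is not the clever code the theorem promises; it only shows how redundancy buys reliability at the cost of speed.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy, in bits, of a coin that lands heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_capacity(p: float) -> float:
    """The 'natural maximum speed' of a binary symmetric channel,
    in useful bits per transmitted bit."""
    return 1.0 - binary_entropy(p)


def repetition_error_rate(p: float, n: int) -> float:
    """Chance that a majority vote over n copies (n odd) gets the bit wrong,
    i.e. that more than half of the n copies were flipped."""
    if n % 2 == 0:
        raise ValueError("n must be odd so that the vote cannot tie")
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    p = 0.1  # assumed flip probability, purely for illustration
    print(f"capacity at p={p}: {bsc_capacity(p):.3f} bits per transmitted bit")
    for n in (1, 3, 5, 9):
        rate = 1 / n  # useful bits per transmitted bit with n repetitions
        print(f"n={n}: rate={rate:.3f}, error={repetition_error_rate(p, n):.5f}")
```

With p = 0.1, three repetitions bring the error rate down from 10% to 2.8%, but at a third of the speed. The surprising part of the theorem is that much better codes exist: you can stay at any rate below the capacity (about 0.53 here) while pushing the error rate as close to zero as you like.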

Anyway, let’s leave behind the theory of communication and go on to real life.

I’ve found that I communicate badly when I’m not sure what language to talk in. If I’m talking in English with someone who I know knows good English, I communicate rather well (like my writing 😛). However, if I’m not sure about the quality of language of the other person, I hesitate. I try to force myself to find simpler / more obvious words, and that disturbs my flow of thought, and I stammer.

Similarly, when I’m not sure whether to talk in Kannada or English (the two languages I’m very comfortable in), I stammer heavily. Again, because I’m not sure if the words I would naturally use will be understood by the other person (the counterparty’s comprehension being the “noise in the channel” here), I slow down, get jittery, and speak badly.

Then of course, there is the very literal interpretation of the channel coding theorem – when your internet connection (or call quality in general) is bad, you end up having to speak slower. When I was hunting for a job in 2020, I remember doing badly in a few interviews because of the quality (or lack thereof) of the internet connection (this was before I had discovered that Google Meet performs badly on Safari).

Similarly, sometime last month, I thought I had prepared well for what I thought was going to be a key conversation at work. The internet was bad, we couldn’t hear each other and kept repeating ourselves (redundancy is how you overcome the noise in the channel), and that diminished throughput massively. Given the added difficulty in communication, I didn’t bring up the key points I had prepared for. It was a damp squib.

Related to this is when you aren’t sure if the person you are speaking to can hear clearly. This disability again clouds the communication channel, meaning you need to build in redundancy, and thus suffer a reduction in throughput.

When you are uncertain of yourself, or underconfident, you end up tending to do badly. That is because when you are uncertain, you aren’t sure if the other person will fully understand what you are going to say. Consequently, you end up talking slower, building redundancy in your speech, etc. You are more doubtful of what you are going to say, and don’t take risks, since your lack of confidence has clouded the “communication channel”, thus depressing your throughput.

Again a lot of this might apply to me alone – I function best when I’m talking / writing at a certain minimum throughput, and operating at anywhere below that makes me jittery and underconfident and a bad communicator. It is no surprise that my writing really took off once I got a computer of my own.

That was in the beginning of July 2004, and within a month, I had started (the predecessor of) this blog. I’ve been blogging for 19 years now.

That aside aside, the channel coding theorem works in non-verbal contexts as well. Back in 2016, before my daughter was born, I remember reading somewhere that tentative mothers lead to cranky babies. The theory was that if the mum was anxious or afraid while handling her baby, the baby wouldn’t perceive the signals of touch sufficiently, and being devoid of communication, become cranky.

We had seen a few examples of this among relatives and friends (and this possibly applies to me as well – my mother had told me that I was the first newborn she ever handled, and so she was a bit tentative in handling me). This again can be explained using the Channel Coding Theorem.

When the mother’s touch is tentative, it is as if the touchy channel between mother and child has some “noise”. The tentativeness of the touch means the baby is not really sure of what the mother is “saying”. With touch, unlike language or bits, redundancy is harder. And so the child grows up insufficiently connected to its mother.

Conversely, later on in life, these tentative mothers tend to bring in redundancy in their communications with their (now jittery) children, and end up holding them too hard, and not letting them go (and some of these children go to therapists, who inevitably blame it on the mothers 😛 ). Ultimately, all of this stems from the noise in the initial communication channel (thanks to the tentativeness of the source).

Ok I’ve rambled on here, so will stop now. However, now that I’ve seeded this thought in you, you too will start seeing the channel coding theorem everywhere (oh – if you think this post is badly written, then that is again like reading this over a noisy channel. And you will get irritated with the lack of throughput and pack).

Tamil

I’m not great at languages. The two languages I can speak fluently in – Kannada and English – I learnt them both before I was four.

I learnt Hindi in school but speak a mix of highly Sanskritised Hindi (textbook Hindi) and Bombay Hindi (from movies) with a thick Kannada accent. As for other languages, the less I say the better.

I spent four years studying in Tamil Nadu so sometimes people assume I know Tamil (it also has to do with my name and face, I guess!). The truth, however, is that by the time I graduated from IIT Madras, I had barely learnt to distinguish between Tamil and Telugu – the two “new” languages I had been massively exposed to during my time there.

Basically I didn’t bother learning Tamil when I lived in Madras in 2000-4 because I didn’t really need to. Most people on campus spoke at least basic English. Most outsiders I interacted with were shopkeepers, restaurant waiters and auto drivers, all of whom could speak broken English at least. And since I’m inherently not good at picking up languages, I just didn’t bother.

Before we started our Tamil Nadu trip yesterday, my wife (who happens to be good at languages) was wondering how I would fare on this front. “Let’s see how you can put your four years of living in TN to good use”, she said. I told her I hoped to mostly get by with English, and broken Tamil.

After yesterday’s lunch she was impressed. “Not bad. With just words for one and two you managed to manage the conversation”. “Yeah that’s how I managed in Chennai”, I replied.

High expectations having thus been set, I’ve had to try and live up to them later on in the trip. My biggest issue is that I end up speaking “assembly language”. I know the words but not the word forms or grammar, and so what I speak can sound funny.

“Instead of asking the shopkeeper what that is, you ended up asking where that is”, my wife informed me yesterday. I had at least got the message across. This kind of faux pas, largely because I can’t speak prepositions and other word forms, continued.

This morning we were at an Adyar Ananda Bhavan (a Chennai-based Tamil Nadu style food chain restaurant) for breakfast. I confidently decided the waiter there might know English and started speaking in English. To our horror, for the rest of the breakfast he spoke to us in Hindi! “If you don’t know Tamil but look Indian you must be Hindi types”, he must have decided.

We tried to talk to him henceforth in our broken Tamil, but he had made his decision. Hindi it was for us.

Then, in the afternoon, at lunch in a “mess” in Karaikudi, I was again struggling to speak Tamil with the waitress (to link back to yesterday’s post – she was a middle-aged woman. Again a cohort I don’t normally see among Bangalore waiters). Suddenly I ended up speaking a few Hindi words!

I quickly realised what had happened – Tamil and Hindi are both languages I can’t think in. For both, I “think in Kannada” and translate to the respective language before speaking. And somewhere my wiring had gone wrong today and instead of translating to Tamil I translated to Hindi.

Later on in the conversation she said something quickly. I caught a few words but couldn’t catch the prepositions and ended up entirely misunderstanding her. Apparently she said “the sambar is hot”. And I replied “no, I don’t need hot rice. Pour the sambar on this only”.

And so it’s been going.

Reading Kannada aloud

I’ve never learnt much Kannada formally. Of course, it is my first language, and the language I’ve always spoken at home. While we had it in school as a “third / fourth language”, the focus there was largely functional – that we learnt the language sufficiently to get by in South Bangalore.

The little I remember from the Kannada lessons in school is that we made fun of some words. Basically, the way they were written was very different from the way we spoke them. “adarinda” became “aaddarinda” or even “aadudarinda”. “nintOgatte” became “nintu hOgatte”. Kannada as it was written was very different from Kannada as we spoke it.

That said, during those days (early 90s), the only newspaper we got at home was in Kannada, and I learnt to read it fairly well. I still made fun of the “aadudarindas” (and my parents agreed it was weird), but I had figured out how to parse the “written Kannada” as “normal Kannada” and got the information I needed to.

In adulthood, my Kannada reading skills have atrophied, primarily because there isn’t much need to read / write Kannada (apart from the occasional addresses or sign boards). In terms of speaking, Kannada is still my first language, but when it comes to the written text (either reading or writing), English has taken its place.

Recently, my wife has gotten our daughter a few Kannada and Hindi story books, so that she can practice reading the two languages. And last night, before she went to bed, my daughter asked me to read out one of the Kannada books to her.

What I found is that Kannada is a language that is very tough to read aloud, primarily due to the large (in my mind) differences between the way it is written and spoken. I read the sentences out alright, but struggled to make meaning out of it since the words were all formally written.

Soon I gave up and resorted to what I used to do with “Kannada Prabha” or “Vijaya Karnataka” back in the 90s – I would see the words in the formal way but call them out “informally”. So I would see “aadudarinda” in the text, and just read it as “adarinda”. I would read “hOguttade” and say “hOgatte”. Wasn’t easy business, but I managed to read out the whole story.

Nevertheless, Kannada is not a language that is easy to read aloud, because the way it’s written is so different from the way it is spoken. It almost feels like the spoken language has evolved significantly over the years, but the written language hasn’t kept up. If you have to read silently, you can just substitute the “normal words” for the “formal words” and get on. However, reading aloud, that is not a choice.

In any case, now I’m worried that with my way of reading aloud (speak the words as I would speak them, rather than the way they are written), I’m messing with my daughter’s Kannada reading skills. And since she spent two of her first three years in London, Kannada is not even her first language (she basically learnt to talk in her nursery)!

Breaking up sentences in the absence of punctuation

From time to time, a joke goes around that makes the value of punctuation clear. Check out this picture, for example.

Recently, I saw this on my twitter timeline (though here it’s an issue of spacing apart from punctuation).

Someone actually wrote an entire book about the value of punctuation.

In any case, I have a pretty bad track record in terms of reading sentences that don’t have punctuation. I can think of two examples right away.

Firstly, my school diary was filled with quotes from Sri Aurobindo and “The Mother”. By turns, we would have to say “thought for the day” in the school assembly. And we had a reliable way of finding such thoughts – just look in the diary and spout out the nuggets.

One of those went:

Always do what you know to be the best even if it is the most difficult thing to do.

Yes, I remember that. Because I had spouted this not once but twice. Now, this is a long sentence without any punctuation. How would you read it?

For the longest time I read it like this.

Always do what you know, to be the best, even if it is the most difficult thing to do. 

So if there are many things that you can do, and you know one of the things, you do that thing even if it is harder than everything else that you might do (but don’t know how to do).

Clearly that doesn’t make that much sense. It was only when I was about to graduate that I figured that it was actually:

Always do what you know to be the best even if it is the most difficult thing to do. 

So there are many things you can do. You know one of them is the “best thing to do”. So even if it is the most difficult thing to do, you do it because you know it is “the best” (not because you “know it the best”).

Another example is from this store near my house. I don’t think the store existed for too long, but it had an interesting and quirky name (in Kannada).

“ellAdEvarakrupe stores”, it said. For the longest time I read it as “ellAdEvara krupe”, or “grace of all gods”. And I thought it was a fascinating name in terms of recognising all religions. And I’ve quoted it many times.

When I was quoting this on Twitter earlier today, I realised that I had got the name of the store all wrong. It’s “ellA dEvarakrupe”, meaning “everything is god’s grace”. It says nothing about which gods are included or excluded, or how many gods there are.
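The store name is a nice instance of what programmers call a word segmentation problem: the same run of letters can be cut into words in more than one valid way. Here is a small sketch in Python that lists every way to split a string using a given vocabulary. The four-entry vocabulary is a toy one I made up from the two readings above, and the function name is hypothetical; a real Kannada word list would of course be far larger.

```python
from functools import lru_cache


def segmentations(text, vocabulary):
    """Return every way of cutting `text` into words from `vocabulary`."""
    vocab = frozenset(vocabulary)

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the string counts as one valid (empty) split.
        if start == len(text):
            return ((),)
        results = []
        for end in range(start + 1, len(text) + 1):
            word = text[start:end]
            if word in vocab:
                # Keep this word and try every split of the remainder.
                for rest in split_from(end):
                    results.append((word,) + rest)
        return tuple(results)

    return [" ".join(words) for words in split_from(0)]


# Toy vocabulary (an assumption for illustration, not a real dictionary).
VOCABULARY = {"ellA", "ellAdEvara", "krupe", "dEvarakrupe"}

if __name__ == "__main__":
    for reading in segmentations("ellAdEvarakrupe", VOCABULARY):
        print(reading)
```

Both readings come out as equally valid splits, which is precisely why the signboard alone could not tell me which one the owner meant.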

What are your favourite examples of sentences that you’ve misread thanks to the lack of punctuation or other visible sentence markers?

 

Paiyas and Kodakas

Growing up, I found that a lot of my non-Kannadiga friends took great pleasure in using the words “maga” and “magane” (both mean “son”). For a long time I didn’t understand what was so pleasurable in calling someone “son”. Wasn’t it normal in other languages as well (though Tamil prefers Macha (brother-in-law))?

It took two incidents, separated by six years (and the latter of the two happened ten years ago), for me to understand this. It had to do with abuses.

I remember visiting a Tamilian friend at her home sometime in 2004. There were a few other friends there, and everyone who was there except me was Tamilian (and this is a 20 year-old problem – people randomly assume I’m Tamilian and speak to me in Tamil). So the host’s mother, in the course of the conversation, would break off into Tamil, and when the discussion was about some boys, would talk about “this paiyya” or “that paiyya”.

I remember trying to suppress a chuckle every time she said “paiyya” (I’ll come to the reason in a bit), but largely managed to keep a straight face through the conversation.

Six years later I was visiting my then-girlfriend, now-wife. Pinky’s mother is Gult (technically her father is also Gult, but his ancestors came to Karnataka so long ago that for all practical purposes they’re dig). On the day I visited, Pinky’s aunt was also visiting, and Pinky’s mother and aunt were talking (in Gult) about some boys. And they kept referring to these boys as “koDaku”.

Again I had to suppress chuckles, for the same reason I had suppressed chuckles when my friend’s mother kept saying “paiyya” six years before. And at the same time I understood why my non-Kannadiga friends took such pleasure in saying “magane”. It has to do with abuses.

When you learn a new language as a teenager, it is fairly standard to start off by first learning the swearwords in that language. For some strange reason, South Indians revel in abusing one another’s mothers. And so the popular abuses in all South Indian languages follow this template.

In Kannada, you have “bOLi magane” (son of a bitch) and “sULe magane” (son of a prostitute). Tamil has “thEvaDiya paiyya” (son of a prostitute again). Telugu has “lanja koDaka” (son of a prostitute, once again) and, rather fascinatingly for the amateur anthropologist, “donganA koDaka” (son of a thief).

And in Telugu and Tamil, the word for “boy” is also used interchangeably for “son”, and it’s the same word that appears in the above swear-phrases (Kannada is a little bit different – the word for “boy” is used for “son”, but the swearwords all have the word that is exclusively used for “son”).

Now you know where this is going.

In normal teenage or college conversation it’s not common to talk about people’s sons. So if you’re a Kannadiga who’s only learnt swearwords in Telugu or Tamil, you would have heard the words “koDaka” and “paiyya” in only that context. You would have never heard these words in isolation in normal conversation, separated from the prefixes that give them their swearing qualities.

So because “thEvaDiya paiyya” is a swearphrase, I had assumed that both words in it are independently swearwords. And so I got shocked that my friend’s mother kept casually saying “paiyya” in the course of normal conversation, and my (extremely paavam/sadhu) friends didn’t flinch.

It is the same with “koDaka” – having appeared in TWO swearphrases I knew, I assumed it was a swearword, and was shocked to see my would-be mother-in-law use it in a casual conversation with her sister.

I imagine it is the same with “magane” – for non-Kannadigas for whom it’s just part of a swearphrase, it is effectively a swearword. And so, when they use the word, it’s as if they are swearing. And that explains their glee in uttering the word.

Kannada has another son-based swearword. “baDDi maga”, which translates to “son of interest” (as in the interest you pay on a loan). I’ve never understood the logic behind that one.

Default Acronym Expansions

Based on the kind of stuff we are interested in, each of us has our own “default expansions” for acronyms.

Now, there are only 26 letters in the English alphabet (and some are much more common than others), and a good acronym is 2-4 letters long, so there are only so many acronyms going around. So it is inevitable that there is acronym overloading, with the same acronym meaning different things in different contexts.
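To put a rough number on that, here is a quick back-of-the-envelope count in Python. It assumes that every string of 2 to 4 letters is an acceptable acronym, which is an upper bound of my own making; since some letters are much more common than others, the number of acronyms that actually get used is far smaller. The function name is hypothetical.

```python
ALPHABET_SIZE = 26


def acronym_count(min_len=2, max_len=4, alphabet_size=ALPHABET_SIZE):
    """Number of distinct letter strings with length from min_len to max_len."""
    return sum(alphabet_size**n for n in range(min_len, max_len + 1))


if __name__ == "__main__":
    for n in range(2, 5):
        print(f"{n}-letter acronyms: {ALPHABET_SIZE**n}")
    print(f"total: {acronym_count()}")
```

That works out to 676 + 17,576 + 456,976 = 475,228 possible acronyms in all, and only 676 of the short two-letter kind, shared between every company, product, sport and government scheme in the world.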

In this context, whenever we see an acronym, we have a default expansion of it based on our interests and domains and exposures. And this can lead to some hilarious interpretations at times.

I read this newsletter called “Margins”. I don’t agree with everything they write, but they write about interesting stuff so I read them. Yesterday’s edition had this gem:

Clearly, the 2008 Financial Crisis and the blowup of CDOs and MBSs left a bad taste in people’s mouths over the chopping up and passing off of debt (note: I now get uncomfortable every time I write “MBS” and “chopped up” in a sentence).

This joke works only because of acronym overloading. MBS also refers to Saudi crown prince Mohammed Bin Salman, and he “allegedly” got dissident journalist Jamal Khashoggi, who worked for the Washington Post, literally chopped up (for those of you for whom Mohammed Bin Salman is the default MBS, it can also refer to Mortgage Backed Securities).

Long ago, I worked for a company that had launched a product acronymised as “LFM”. I could never understand what this product does because my “default expansion” of LFM is Left Arm Fast Medium.

Acronym confusion can also happen when you’re deeply familiar with one domain with its own set of jargons and acronyms, and then are suddenly exposed to another domain that has its own set of jargons and acronyms. It takes a long time to “unlearn” your old acronyms and then learn the new ones.

Then again, given the limited number of acronyms available, sooner or later we had better learn to learn and unlearn meanings of acronyms.

Maybe one day Kohlberg Kravis Roberts will buy Kolkata Knight Riders
I still don’t understand how the IPL allowed Delhi Capitals since there used to exist a team called Deccan Chargers in the same league
Does your All India Rank get announced on All India Radio?

Giant Squid is Good Stuff.

 

The difficulty of song translation

One of my wife’s favourite nursery rhymes is this song that is sung to the tune of “for he’s a jolly good fellow”, and about a bear going up a mountain.

For a long time I only knew of the Kannada version of this song (which is what the wife used to sing), but a year or two back, I found the “original” English version as well.

And that was a revelation, for the lyrics in the English version make a lot more sense. They go:

The bear went over the mountain;
The bear went over the mountain
The bear went over the mountain, to see what he could see.
And all that he could see, and all that he could see
Was the other side of the mountain,
The other side of the mountain
The other side of the mountain, was all that he could see.

Now, the Kannada version, sung to the same tune, obviously goes “ಕರಡಿ ಬೆಟ್ಟಕ್ಕೆ ಹೋಯಿತು” (karaDi beTTakke hoithu). That part has been well translated. However, the entire stanza hasn’t been translated properly, because of which the song becomes a bit meaningless.

The lyrics, when compared to the original English version, are rather tame. Since a large part of my readership don’t understand Kannada, here is my translation of the lyrics (btw, the lyrics used in these YouTube versions are different from the lyrics that my wife sings, but both are similar):

The bear went to the mountain.
The bear went to the mountain.
The bear went to the mountain.
To see the scenery

And what did it see?
What did it see?
The other side of the mountain.
The other side of the mountain.
It saw the scenery of the other side of the mountain.

Now, notice the important difference in the two versions, which massively changes the nature of the song. The Kannada version simply skips the “all that he could see” part, which I think is critical to the story.

The English version, in a way, makes fun of the bear, talking about how it went over the mountain thinking it’s a massive task, but “all that he could see” from there was merely the other side of the mountain. This particular element is missing in Kannada – there is nothing in the lyrics that suggests that the bear’s effort to climb the mountain was a bit of a damp squib.

And that, I think, is due to the difficulty of translating songs. When you translate a song, you need to get the same letter and spirit of the lyrics, while making sure they can follow the already-set music as well (and even get the rhyming right). And unless highly skilled bilingual poets are involved, this kind of a translation is really difficult.

So you get half-baked translations, like the bear story, which possibly captures the content of the story but completely ignores its spirit.

After I had listened to the original English version, I’ve stopped listening to the Kannada version of the bear-mountain song. Except when the wife sings it, of course.

 

Gults and Grammar

Back in IIT, it was common to make fun of people from Andhra Pradesh for their poor command over the English language. It was a consequence of the fact that JEE coaching is far more institutionalised in that (undivided) state, because of which people come to IIT from less privileged backgrounds (on average) than their counterparts in Karnataka or Tamil Nadu or Maharashtra.

Now, in hindsight, making fun of people’s English doesn’t sound particularly nice, but sometimes stories come up that make it incredibly hard to resist.

This one is from Matt Levine’s newsletter. And it is about an insider trading ring. This is a quote that Levine has quoted in his newsletter (pay attention to the names):

According to the SEC’s complaint, Janardhan Nellore, a former IT administrator then at Palo Alto Networks Inc., was at the center of the trading ring, using his IT credentials and work contacts to obtain highly confidential information about his employer’s quarterly earnings and financial performance. As alleged in the complaint, until he was terminated earlier this year, Nellore traded Palo Alto Networks securities based on the confidential information or tipped his friends, Sivannarayana Barama, Ganapathi Kunadharaju, Saber Hussain, and Prasad Malempati, who also traded.

The SEC’s complaint alleges that the defendants sought to evade detection, with Nellore insisting that the ring use the code word “baby” in texts and emails to refer to his employer’s stock, and advising they “exit baby,” or “enter few baby.” The complaint also alleges that certain traders kicked back trading profits to Nellore in small cash transactions intended to avoid bank scrutiny and reporting requirements. After the FBI interviewed Nellore about the trading in May, he purchased one-way tickets to India for himself and his family and was arrested at the airport.

You can look at Levine’s newsletter to understand his take on the story (it’s towards the bottom), but what catches my eye is the grammar. I think it is all fine to refer to the insider-traded stock as a “baby”, but at least be grammatically correct about it!

“Enter few baby” is so obviously grammatically incorrect (it can hardly even be a typo) that when intercepted by someone like the SEC, it would immediately set alarm bells ringing. Which is what I suppose possibly happened.

So my take on this case is – don’t insider trade, but even if you do, be grammatical about your signals. If you’re so obviously grammatically wrong, it is easy for whoever intercepts your chats to know you’re up to something fishy.

But then if you’re gult..

YG Rao

We’re celebrating Ganesha Chaturthi by re-watching Ganeshana Maduve and Gowri Ganesha, two classic movies from the early 1990s starring Anant Nag and Vinaya Prasad.

Ganeshana Maduve is a shop-around-the-corner / you’ve-got-mail kind of story of real-life neighbours who hate each other but court each other through letters. Real-life Adilakshmi has adopted the name “Shruti” for her singing career, and she replies to her fan-mail under the same name.

It is her fan/neighbour’s name that had intrigued me thus far. He is the titular Ganesha, but saying that “Ganesha” sounds too old-fashioned, he writes his letters under the name “Y G Rao”, short for his full name which is “Y Ganesh Rao”.

Now, this would have been fine, except that later on in the movie his father’s name is shown to be Govinda. And under Kannada naming conventions, this simply doesn’t make sense. Typically, in Kannada names, if you have only one “initial”, it represents your father’s given name (for example, the S in my name stands for Shashidhar, which is my father’s given name).

Hence, under standard Kannada naming conventions, Govinda’s son has to be G Ganesh Rao. And in what is an overall excellent movie (it’s easily my most-watched movie of all time. Today was perhaps the 50th time I watched it), this naming convention was a bit intriguing.

The thing with Ganeshana Maduve is that each time you watch it, you discover a layer that you hadn’t discovered (or had missed) earlier. And one detail I found today that I’d missed earlier is that the movie is based on a Telugu novel. And then it all started making sense.

It is perfectly okay under Telugu naming convention for Govinda’s son to be Y Ganesh Rao, for a single initial there represents the family name, rather than the father’s given name.

And so it is very likely that when the Telugu novel was adapted into a Kannada film, the names were kept the same, and so we got the Telugu convention into the Kannada movie!

The next item on today’s festival agenda was to watch Gowri Ganesha, but I need to get some work done, so the wife is watching that alone. And while some process runs I’m writing this post.

Good vodka and bad chicken

When I studied Artificial Intelligence, back in 2002, neural networks weren’t a thing. The limited compute capacity and storage available at that point in time meant that most artificial intelligence consisted of what is called “rule based methods”.

And as part of the course we learnt about machine translation, and the difficulty of getting the implicit meaning across. The favourite example among computer scientists at that time was the story of how some scientists translated “the spirit is willing but the flesh is weak” into Russian using an English-Russian translation software, and then converted it back into English using a Russian-English translation software.

The result was “the vodka is excellent but the chicken is not good”.

While this joke may not be valid any more thanks to the advances in machine translation, aided by big data and neural networks, the issue of translation is useful in other contexts.

Firstly, speaking in a language that is not your “technical first language” makes you eschew jargon. If you have been struggling to get rid of jargon from your professional vocabulary, one way to get around it is to speak more in your native language (which, if you’re Indian, is unlikely to be your technical first language). Devoid of the idioms and acronyms that you normally fill your official conversation with, you are forced to think, and this practice of talking technical stuff in a non-usual language will help you cut your jargon.

There is another use case for using non-standard languages – dealing with extremely verbose prose. A number of commentators, a large number of whom are rather well-reputed, have this habit of filling their columns with flowery language, GRE words, repetition and rhetoric. While there is usually some useful content in these columns, it gets lost in the language and idioms and other things that would make the columnist’s high school English teacher happy.

I suggest that these columns be given the spirit-flesh treatment. Translate them into a non-English language, get rid of redundancies in sentences and then translate them back into English. This process, if the translators are good at producing simple language, will remove the bluster and make the column much more readable.

Speaking in a non-standard language can also make you get out of your comfort zone and think harder. Earlier this week, I spent two hours recording a podcast in Hindi on cricket analytics. My Hindi is so bad that I usually think in Kannada or English and then translate the sentence “live” in my head. And as you can hear, I sometimes struggle for words. Anyway here is the thing. Listen to this if you can bear to hear my Hindi for over an hour.