Just Plot It

One of my favourite work stories is from a job I did a long time ago. The task given to me was demand forecasting, and the variable I needed to forecast was so “micro” (the intersection of this, that and the other) that forecasting was an absolute nightmare.

A side effect of this has been that I find it impossible to believe that it’s possible to forecast anything at all. Several (reasonably successful) forecasting assignments later, I still dread it when the client tells me that the project in question involves forecasting.

Another side effect is that the utter failure of standard textbook methods in that monster forecasting exercise all those years ago means that I find it impossible to believe that textbook methods work with “real life data”. Textbooks and college assignments are filled with problems that when “twisted” in a particular way easily unravel, like a well-tied tie knot. Industry data and problems are never as clean, and elegance doesn’t always work.

Anyway, coming back to the problem at hand, I had struggled for several months with this monster forecasting problem. Most of this time, I had been using one programming language that everyone else in the company used. The code was simultaneously being applied to lots of different sub-problems, so through the months of struggle I had never bothered to really “look at” the data.

I must have told this story before, when I spoke about why “data scientists” should learn MS Excel. For what I did next was to load the data onto a spreadsheet and start looking at it. And “looking at it” involved graphing it. And the solution, or the lack of it, lay right before my eyes. The data was so damn random that it was a wonder that anything had been forecast at all.

It was also a wonder that the people who had built the larger model (into which my forecasting piece was to plug) had assumed that this data would be forecastable at all. (I mentioned this to the people who had built the model, and we’ll leave that story for another occasion.)

In any case, looking at the data, by putting it in a visualisation, completely changed my perspective on how the problem needed to be tackled. And this has been a learning I haven’t let go of since – the first thing I do when presented with data is to graph it out, and visually inspect it. Any statistics (and any forecasting for sure) comes after that.
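Even without a spreadsheet, a first look can be very cheap. The sketch below is an illustration only, not the code from that project: a hypothetical `sparkline` helper that draws a series as a row of block characters, applied to two made-up series, one with a clear trend and one that is pure noise.

```python
import random

BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a sequence of numbers as a one-line text chart."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A flat series: draw every point at the lowest bar.
        return BARS[0] * len(values)
    scale = (len(BARS) - 1) / (hi - lo)
    return "".join(BARS[round((v - lo) * scale)] for v in values)


# Made-up data: a steadily rising series and a purely random one.
random.seed(0)
trend = [10 + i for i in range(24)]
noise = [random.uniform(0, 30) for _ in range(24)]

print("trend:", sparkline(trend))
print("noise:", sparkline(noise))
```

The first line climbs steadily from left to right, while the second jumps around with no pattern, which is the kind of thing that is obvious to the eye long before any statistic is computed.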

Yet, I find that a lot of people simply fail to appreciate the benefits of graphing. That it is not intuitive to do with most programming languages doesn’t help. Incredibly, even Python, a favoured tool of a lot of “data scientists”, doesn’t make graphing easy. Last year when I was forced to use it, I found that it was virtually impossible to create a PDF with lots of graphs – something that I do as a matter of routine when working on R (I subsequently figured out a (rather inelegant) hack the next time I was forced to use Python).

Maybe when you work on data that doesn’t have meaningful variables – such as images, for example – graphing doesn’t help (since a variable on its own has little information). But when the data remotely has some meaning – sales or production or clicks or words, graphing can be of immense help, and can give you massive insight on how to develop your model!

So go ahead, and plot it. And I won’t mind if you fail to thank me later!

Ticking all the boxes

Last month my Kindle gave up. It refused to take charge, only heating up the charging cable (and possibly destroying an Android charger) in the process. This wasn’t the first time this had happened.

In 2012, my first Kindle had given up a few months after I started using it, with its home button refusing to work. Amazon had sent me a new one then (I’d been amazed at the no-questions-asked customer-centric replacement process). My second Kindle (the replacement) developed problems in 2016, which I made worse by trying to pry it open with a knife. After I had sufficiently damaged it, there was no way I could ask Amazon to do anything about it.

Over the last year, I’ve discovered that I read much faster on my Kindle than in print – possibly because it allows me to read in the dark, it’s easy to hold, I can read without distractions (unlike phone/iPad) and it’s easy on the eye. I possibly take half the time to read on a Kindle what I take to read in print. Moreover, I find the note-taking and highlighting feature invaluable (I never made a habit of taking notes on physical books).

So when the Kindle stopped working I started wondering if I might have to go back to print books (there was no way I would invest in a new Kindle). Customer care confirmed that my Kindle was out of warranty, and after putting me on hold for a long time, gave me two options. I could either take a voucher that would give me 15% off on a new Kindle, or the customer care executive could “talk to the software engineers” to see if they could send me a replacement (but there was no guarantee).

Since I had no plans of buying a new Kindle, I decided to take a chance. The customer care executive told me he would get back to me “within 24 hours”. It took barely an hour for him to call me back, and a replacement was in my hands in 2 days.

It got me wondering what “software engineers” had to do with the decision to give me a replacement (refurbished) Kindle. I soon realised that Amazon possibly has an algorithm to determine whether to give a replacement for Kindles that have gone kaput out of warranty. I started trying to guess what such an algorithm might look like.

The interesting thing is that among all the factors that I could list out based on which Amazon might make a decision to send me a new Kindle, there was not one that would suggest that I shouldn’t be given a replacement. In no particular order:

  • I have been an Amazon Prime customer for three years now
  • I buy a lot of books on the Kindle store. I suspect I’ve purchased books worth more than the cost of the Kindle in the last year.
  • I read heavily on the Kindle
  • I don’t read Kindle books on other apps (phone / iPad / computer)
  • I haven’t bought too many print books from Amazon. Most of the print books I’ve bought have been gifts (I’ve got them wrapped)
  • My Goodreads activity suggests that I don’t read much outside of what I’ve bought from the Kindle store

In hindsight, I guess I made the correct decision of letting the “software engineers” determine whether I qualify for a new Kindle. I guess Amazon figured that had they not sent me a new Kindle, there was a significant amount of low-marginal-cost sales that they were going to lose!
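For what it is worth, here is a toy version of what such a rule could look like. This is pure guesswork, not Amazon’s actual algorithm: the `Customer` fields, the two-year horizon and the 0.5 discounts are all assumptions made up for illustration. The idea is simply to compare the Kindle store sales that would be at risk against the cost of a refurbished device.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical signals, loosely based on the factors listed above.
    annual_kindle_spend: float   # spend on the Kindle store in the last year
    reads_on_other_apps: bool    # would keep buying e-books without a device
    buys_print_books: bool       # would switch to print from the same store


def sales_at_risk(customer, horizon_years=2.0):
    """Estimate the store sales lost if the customer has no working device."""
    share = 1.0
    # Made-up discounts: a customer with a fallback keeps part of the spend.
    if customer.reads_on_other_apps:
        share *= 0.5
    if customer.buys_print_books:
        share *= 0.5
    return customer.annual_kindle_spend * horizon_years * share


def should_replace(customer, device_cost):
    """Send a replacement when the sales at risk exceed the device cost."""
    return sales_at_risk(customer) > device_cost


heavy_reader = Customer(150.0, reads_on_other_apps=False, buys_print_books=False)
hedged_reader = Customer(150.0, reads_on_other_apps=True, buys_print_books=True)

print(should_replace(heavy_reader, device_cost=80.0))
print(should_replace(hedged_reader, device_cost=80.0))
```

With these made-up numbers, the reader who buys only on the Kindle store gets a replacement, while the one who would carry on buying through other apps or in print does not.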

I duly rewarded them with two book purchases on the Kindle store in the course of the following week!

Voice assistants and traditional retail

Traditionally, retail was an over-the-counter activity. There was a physical counter between the buyer and the seller, and the buyer would demand what he wanted, and the shopkeeper would hand it over to him. This form of retail gave greater power to the shopkeeper, which meant that brands could practice what can be described as “push marketing”.

Most of the marketing effort would be spent in selling to the shopkeeper and then providing him sufficient incentives to sell it on to the customer. In most cases the customer didn’t have that much of a choice. She would ask for “salt”, for example, and the shopkeeper would give her the brand of salt that benefited him the most to sell.

Sometimes some brands would provide sufficient incentives to the shopkeeper to ensure that similar products from competing brands wouldn’t be stocked at all, ensuring that the customer faced a higher cost of getting those products (going to another shop) if she desired them. Occasionally, such strategies would backfire (a customer with extremely strong brand preferences would eschew the shopkeeper who wouldn’t stock these brands). Mostly they worked.

The invention of the supermarket (sometime in the late 1800s, if I remember my research for my book correctly – it followed the concept of set prices) changed the dynamics a little bit. In this scenario, while the retailer continues to do the “shortlisting”, the ultimate decision is in the hands of the customer, who will pick her favourite among the brands on display.

This increases the significance of branding in the mind of the customer. Even the strongest incentives to retailers won’t work (unless they result in competing brands being wiped out from the shelves – but that comes with a risk) if the customer has a preference for a competing product. At best the retailer can offer these higher-incentive brands better shelf space (eye level as opposed to ankle level, for example).

However, even in traditional over-the-counter retail, branding matters to an extent when there is choice (as I had detailed in an earlier post written several years ago). This is in the instance where the shopkeeper asks the customer which brand she wants, and the customer has to make the choice “blind” without knowing what exactly is available.

I’m reminded of this issue of branding and traditional retail as I try to navigate the Alexa voice assistant. Nowadays there are two ways in which I play music using Spotify – one is the “direct method” from the phone or computer, where I search for a song, a list gets thrown up and I can select which one to play. The other is through Alexa, where I ask for a song and the assistant immediately starts playing it.

With popular songs where there exists a dominant version, using the phone and Alexa give identical results (though there are exceptions to this as well – when I ask Alexa to play Black Sabbath’s Iron Man, it plays the live version which is a bit off). However, when you are looking for songs that have multiple interpretations, you implicitly let Alexa make the decision for you, like a shopkeeper in traditional retail.

So, for example, most popular nursery rhymes have been covered by several groups. Some do the job well, singing the rhymes in the most dominant tunes, and using the most popular versions of the lyrics. Others mangle the tunes, and even the lyrics (an Indian YouTube channel called Chuchu TV, for example, has changed the story of Jack and Jill to give it a “moral”. I’m sure as a teenager you had changed the lyrics of Jack and Jill as well :P).

And in this situation you want more control over which version is played. For most songs I prefer the Little Baby Bum version, while for some others I prefer the Nursery Rhymes 123 version, but there is no “rule”. And this makes it complicated to order songs via Alexa.

More importantly, if you are a music publisher, the usage of Alexa to play on Spotify means that you might be willing to give Spotify greater incentives so that your version of a song comes up on top when a user searches for it.

And when you bring advertising and concepts such as “paid search” into the picture, the fact that the voice assistants dictate your choices makes the situation very complicated indeed.

I wonder if there’s a good solution to this problem.

I’m not a data scientist

After a little over four years of trying to ride a buzzword wave, I hereby formally cease to call myself a data scientist. There are some ongoing assignments where that term is used to refer to me, and that usage will continue, but going forward I’m not marketing myself as a “data scientist”, and will not use the phrase “data science” to describe my work.

The basic problem is that over time the term has come to mean something rather specific, and that doesn’t represent me and what I do at all. So why did I go through this long journey of calling myself a “data scientist”, trying to fit into the “data science community”, and now exiting?

It all started with a need to easily describe what I do.

To recall, my last proper full-time job was as a Quant at a leading investment bank, when I got this idea that rather than building obscure models for trading obscure corner cases, I might as well use my model-building skills to solve “real problems” in other industries which were back then not as well served by quants.

So I started calling myself a “Quant consultant”, except that nobody really knew what “quant” meant. I got variously described as a “technologist” and a “statistician” and “data monkey” and what not, none of which really captured what I was actually doing – using data and building models to help companies improve their businesses.

And then “data science” happened. I forget where I first came across this term, but I had been primed for it by reading Hal Varian saying that the “sexiest job in the next ten years will be statisticians”. I must mention that I had never come across the original post by DJ Patil and Thomas Davenport (that introduces the term) until I looked for it for my newsletter last year.

All I saw was “data” and “science”. I used data in my work, and I tried to bring science into the way my clients thought. And by 2014, Data Science had started becoming a thing. And I decided to ride the wave.

Now, data science has always been what artificial intelligence pioneer Marvin Minsky called a “suitcase term” – words or phrases that mean different things to different people (I heard about the concept first from this brilliant article on the “seven deadly sins of AI predictions”).

For some people, as long as some data is involved, and you do something remotely scientific it is data science. For others, it is about the use of sophisticated methods on data in order to extract insights. Some others conflate data science with statistics. For some others, only “machine learning” (another suitcase term!) is data science. And in the job market, “data scientist” can sometimes be interpreted as “glorified Python programmer”.

And right from inception, there were the data science jokes, like this one:

It is pertinent to put a whole list of them here.

“‘Data Scientist’ is a Data Analyst who lives in California”
“A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.”
“A data scientist is a business analyst who lives in New York.”
“A data scientist is a statistician who lives in San Francisco.”
“Data Science is statistics on a Mac.”

I loved these jokes, and thought I had found this term that had rather accurately described me. Except that it didn’t.

The thing with suitcase terms is that they evolve over time, as they start getting used differently in different contexts. And so it was with data science. Over time, it has been used in a dominant fashion by people who mean it in the “machine learning” sense of the term. In fact, in most circles, the defining feature of a data scientist is the ability to write code in Python and to use the scikit-learn package – neither of which is my distinguishing feature.

While this dissociation with the phrase “data science” has been coming for a long time (especially after my disastrous experience in the London job market in 2017), the final triggers I guess were a series of posts I wrote on LinkedIn in August/September this year.

The good thing about writing is that it helps you clarify your mind, and as I ranted about what I think data science should be, I realised over time that what I have in mind as “data science” is very different from what the broad market has in mind as “data science”. As per the market definition, just doing science with data isn’t data science any more – instead it is defined rather narrowly as a part of the software engineering stack where problems are solved based on building machine learning models that take data as input.

So it is prudent that I stop using the phrases “data science” and “data scientist” to describe myself and the work that I do.

PS: My newsletter will continue to be called “the art of data science”. The name gets “grandfathered” along with other ongoing assignments where I use the term “data science”.

Triangle marketing

This blog post is based more on how I have bought rather than how I have sold. The basic concept is that when you hear about a product or service from two or more independent sources, you are more likely to buy it.

The threshold varies by the kind of product you are looking at. When it is a low touch item like a book, two independent recommendations are enough. When it involves higher cost and has higher impact, like a phone, it might be five recommendations. For something life changing like a keto diet, it might be ten (I must mention I tried keto for half a day and gave up, not least because I figured I don’t really need it – I’m barely 3-4 kg overweight).

The important point to note is that the recommendations need to come from independent sources – if two people who you didn’t expect to have a similar taste in books were to recommend the same book, the second of these recommendations is likely to create an “aha moment” (ok I’m getting into consultant-speak now), and that is likely to drive a purchase (or at least trying a Kindle sample).

In some ways, exposure to the same product through independent sources is likely to create a feeling of a self-fulfilling prophecy. “Alice is also using this. Bob is also using this” will soon go into “everybody seems to be using it. I should also use it”.

So what does this mean to you if you are a seller? Basically you need to hit your target audience through various channels. I had mentioned in my post earlier this week about how branding creates a “position of strength”, and how direct sales is normally hard because it is done through a position of weakness.

The idea is that before you hit your audience with a direct sale, you need to “warm them up” with your brand, and you need to do this through various channels. Your brand needs to impact on your audience through multiple independent channels, so that it has become a self-fulfilling prophecy before you approach to make the sale.

What these precise channels are depends on your business and the product that you’re trying to sell, but the important thing is that they are independent. So for example, putting advertisements in various places won’t help since the target will treat all of them as coming from the same source.
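To make the point about independence concrete, here is a small sketch. It is only an illustration: the thresholds are the rough numbers mentioned above, and the names and the (source, channel) representation are made up. Exposures are counted by distinct source, so any number of advertisements from the same brand still add up to one.

```python
# Hypothetical thresholds, taken from the rough numbers in the post.
THRESHOLDS = {"book": 2, "phone": 5, "keto diet": 10}


def independent_sources(exposures):
    """Count distinct sources; repeat exposures from one source count once."""
    return len({source for source, _channel in exposures})


def likely_to_buy(product, exposures):
    """True once the number of independent sources reaches the threshold."""
    return independent_sources(exposures) >= THRESHOLDS[product]


# Each exposure is a (source, channel) pair; the names are made up.
ads_only = [("brand", "billboard"), ("brand", "newspaper"), ("brand", "web banner")]
word_of_mouth = [("Alice", "dinner chat"), ("Bob", "Goodreads review")]

print(likely_to_buy("book", ads_only))
print(likely_to_buy("book", word_of_mouth))
```

Three advertisements do not clear the bar for a book, but two unrelated people recommending it do.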

Finally, where is the “triangle” in this marketing? It is in the idea that you complete the branding and sales by means of “triangulation”. You send out vectors in seemingly random directions trying to build your brand, and they will get reflected till a time when they intersect, or “triangulate”. Ok I know my maths here is messy and not up to my usual standard, but I guess you know what I’m getting at!

The advantage of recurring payments

One of the best things about payments in the UK is the ubiquity of the direct debit system. From gym memberships to contact lenses to television licenses, all sorts of subscriptions are sold on a direct debit based model.

The mechanism is simple – the merchant, with the consent of the customer, sets up a direct debit system with the customer’s account such that a specified amount is debited periodically. This direct debit system can be cancelled at the customer’s discretion, resulting in automatic annulment of the subscription.

This is a great business model because it allows businesses to acquire customers for a repeated transaction, without the latter having to commit for too long a period.

The key feature of the direct debit system is the customer opt-out. That the account will be continued by default means that it takes explicit action by the user to terminate the subscription, which helps the business acquire customers with the cost amortised over several time periods. The anytime opt-out (which the user can exercise on her bank’s website or app, without the consent of the merchant) means that the commitment at any time for the customer is for one period only, making the product an easier sell.

In the absence of the recurring payment based model, the business will either have to offer short term “subscriptions”, which implies a customer acquisition cost at each period, or long term contracts, which takes a higher upfront commitment from the customer making it a much harder sell.
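The amortisation argument can be put into rough numbers. The sketch below is an illustration under stated assumptions, not data from any real business: the acquisition cost of 30 and the 90% retention rate are made up, and customers are assumed to stay on with the same probability every period, so the expected number of paid periods is the geometric series 1 / (1 - retention).

```python
def expected_paid_periods(retention):
    """Expected number of paid periods when a customer stays on with
    probability `retention` each period (geometric series 1 + r + r^2 + ...)."""
    if not 0 <= retention < 1:
        raise ValueError("retention must be in [0, 1)")
    return 1 / (1 - retention)


def acquisition_cost_per_period(cac, retention):
    """Acquisition cost spread over the expected life of the subscription."""
    return cac / expected_paid_periods(retention)


CAC = 30.0  # hypothetical cost of acquiring one customer

# Opt-in every period: the customer has to be won again each time,
# which is modelled here as zero automatic retention.
print(round(acquisition_cost_per_period(CAC, retention=0.0), 2))

# Opt-out (direct debit): assume 90% of customers simply stay on each period.
print(round(acquisition_cost_per_period(CAC, retention=0.9), 2))
```

With these made-up numbers, the same acquisition cost works out to about 3 per period under opt-out, against 30 when every period needs a fresh sale.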

In that sense, a recurring payment model offers a nice middle ground that unlocks value for both the business and the customer, enhancing welfare all around.

Seen this way, the lack of a recurring payments system is a key shortcoming of the payments scene in India. While it was possible to do this earlier, current rules by the Reserve Bank of India require authorisation by the customer (in the form of two factor authentication) for every transaction, making such payments opt-in rather than opt-out (the opt-out feature is key to amortising customer acquisition cost).

The updated version of the unified payments interface (UPI 2.0) was supposed to offer this recurring feature, but media reports say that the update is being rolled out without this feature. That is surely an opportunity missed for India’s businesses to grow.

Branding and positions of strength

I had an invitation to attend a data science networking event today. I had accepted the free pass for option value, but decided today to not exercise the option. Given I was not going to speak at the event, I realised that the value of the conversations at the event for me would be limited.

One of the internet gurus (it might be Naval Ravikant, but I’m unable to locate the source) has this principle that you shouldn’t go to networking events unless you’re speaking. Now, if everyone applied this principle events would look very different, with speakers speaking to one another (like in NED Talks!).

Thinking about it, though, I see clear value in this maxim. Basically when you go to a networking event and speak, you can network from a position of strength, especially after you’ve spoken. This is assuming you’ve done a good job of your speech, of course, but apart from elevating your status as a “speaker”, speaking at the event allows potential counterparties in conversations to have prior information about you before they talk to you.

So there is context in the conversation, and since you know they know something about you, you can speak from a position of strength, and hopefully make a greater impact.

It is not just about speaking and events. For a long time, a lot of my consulting business came from readers of this blog (yes, really!). This was because these people had been reading me, and knew me, and so when I spoke to them, there was already a “prior” on which I could base my sale. Of late, I’ve been putting out a lot of work-related content here and on LinkedIn, and that has sparked several conversations, which I have been able to navigate from a position of strength.

A possibly simpler word to describe this is “branding”. By speaking at an event or putting out content or indulging in other activities that let people know about you and what you do, you are building a brand. And then when the conversation happens, the brand you have thus built puts you in a position of strength which makes the sale far easier than if you didn’t have the brand.

You need to remember that position of strength as I’ve described here is not relative. It is not always necessary for the brand to elevate you to a level higher than the counterparty. All that is necessary is for it to put you at a high enough level that you don’t need to talk from a position of weakness. And if you think about it, cold calling and door to door sales are basically selling from a position of weakness – while they might have worked occasionally (which makes for fantastic stories), they are for the most part not successful.

And in some way, this concept of branding and positions of strength is well correlated to what I recently described as “the secret of my happiness”. By being really good at what you are good at, you are essentially putting yourself in a position of strength, so that people have no choice but to tolerate your inadequacies in other areas. Putting it another way, being really good at what you are good at is another exercise in brand building!

Brand building efforts can sometimes fail. There are times when I have given talks and got few questions – clearly indicating it was a wasted talk (either I didn’t talk well, or the audience didn’t get it). I have put out content that has just sunk without a trace or any feedback. The important thing to know is that somewhere it all adds up – that these small efforts in branding can come together at some point in time, and make it work for you.