Beer and diapers: Netflix edition

When we started using Netflix last May, we created three personas for the three of us in the family – “Karthik”, “Priyanka” and “Berry”. At that time we didn’t realise that there was already a pre-created “kids” (subsequently renamed “children” – don’t know why that happened) persona there.

So while Priyanka and I mostly use our respective personas to consume Netflix (our interests in terms of video content hardly intersect), Berry uses both her profile and the kids profile for her stuff (of course, she’s too young to put it on herself. We do it for her). So over the year, the “Berry” profile has been mostly used to play Peppa Pig, and the occasional wildlife documentary.

Which is why we were shocked the other day to find that “Real life wife swap” had been recommended on her account. Yes, you read that right. We muttered a word of abuse about Netflix’s machine learning algorithms and since then have only used the “kids” profile to play Berry’s stuff.

Since then I’ve been wondering what made Netflix recommend “real life wife swap” to Berry. Surely, it would have been clear to Netflix that while it wasn’t officially classified as one, the Berry persona was a kid’s account? And even if it wasn’t, didn’t the fact that the account was used for watching kids’ stuff lead the collaborative filtering algorithms at Netflix to recommend more kids’ stuff? I’ve come up with various hypotheses.

Since I’m not Netflix, and I don’t have their data, I can’t test it, but my favourite hypothesis so far involves what is possibly the most commonly cited example in retail analytics – “beer and diapers”. In this most-likely-apocryphal story, a supermarket chain discovered that beer and diapers were highly likely to appear together in shopping baskets. Correlation led to causation and a hypothesis was made that this was the result of tired fathers buying beer on their diaper shopping trips.

So the Netflix version of beer-and-diapers, which is my hypothesis, goes like this. Harried parents are pestered by their kids to play Peppa Pig and other kiddie stuff. The parents are so stressed that they don’t switch to the kid’s persona, and instead play Peppa Pig or whatever from their own accounts. The kid is happy and soon goes to bed. And then the parent decides to unwind by watching some raunchy stuff like “real life wife swap”.

Repeat this story in enough families, and you have a strong enough pattern that accounts not explicitly classified as “kids/children” have strong activity of both kiddie stuff and adult content. And when you use an account not explicitly mentioned as “kids” to watch kiddie stuff, it gets matched to these accounts that have created the pattern – Netflix effectively assumes that watching kid stuff on an adult account indicates that the same account is used to watch adult content as well. And so serves it to Berry!

Machine learning algorithms basically work on identifying patterns in data, and then fitting these patterns on hitherto unseen data. Sometimes the patterns make sense – like Google Photos identifying you even in your kiddie pics. Other times, the patterns are offensive – like the time Google Photos classified a black woman as a “gorilla”.

Thus what is necessary is some level of human oversight, to make sure that the patterns the machine has identified make some sort of sense (machine learning purists say this is against the spirit of machine learning, since one of the purposes of machine learning is to discover patterns not perceptible to humans).

That kind of oversight at Netflix would have suggested that you can’t tag a profile to a “kiddie content AND adult content” category if the profile has been used to watch ONLY kiddie content (or ONLY adult content). And that kind of oversight would have also led Netflix to investigate the issue of users using “general” accounts for their kids, and to come up with an algorithm to classify such accounts as kids’ accounts, and serve only kids’ content there.

It seems, though, that algorithms run supreme at Netflix, and so my baby daughter gets served “real life wife swap”. Again, this is all a hypothesis (real life wife swap being recommended is a fact, of course)!

More on interactive graphics

So for a while now I’ve been building this cricket visualisation thingy. Basically it’s what I think is a pseudo-innovative way of describing a cricket match, by showing how the game ebbs and flows, and marking off the key events.

Here’s a sample, from the ongoing game between Chennai Super Kings and Kolkata Knight Riders.

As you might appreciate, this is a bit cluttered. One “brilliant” idea I had to declutter this was to create an interactive version, using Plotly and D3.js. It’s the same graphic, but instead of all those annotations appearing, they’ll appear when you hover on those boxes (the boxes are still there). Also, when you hover over the line you can see the score and what happened on that ball.

When I came up with this version two weeks back, I sent it to a few friends. Nobody responded. I checked back with them a few days later. Nobody had seen it. They’d all opened it on their mobile devices, and interactive graphics are ill-defined for mobile!

Because on mobile there’s no concept of “hover”. Even “click” is badly defined because fingers are much fatter than mouse pointers.

And nowadays everyone uses mobile – even in corporate settings. People who spend most time in meetings only have access to their phones while in there, and consume all their information through that.

Yet, you have visualisation “experts” who insist on the joys of tools such as Tableau, or other things that produce nice-looking interactive graphics. People go ga-ga over motion charts (they’re slightly better in that they can communicate more without input from the user).

In my opinion, the lack of use on mobile is the last nail in the coffin of interactive graphics. It is not like they didn’t have their problems already – the biggest problem for me is that it takes too much effort on the part of the user to understand the message that is being sent out. Interactive graphics are also harder to do well, since the users might use them in ways not intended – hovering and clicking on the “wrong” places, making it harder to communicate the message you want to communicate.

As a visualiser, one thing I’m particular about is being in control of the message. As a rule, a good visualisation contains one overarching message, and a good visualisation is one in which the user gets the message as soon as she sees the chart. And in an interactive chart which the user has to control, there is no way for the designer to control the message!

Hopefully this difficulty with seeing interactive charts on mobile will mean that my clients will start demanding them less (at least that’s the direction in which I’ve been educating them all along!). “Controlling the narrative” and “too much work for the consumer” might seem like esoteric objections, but “can’t be consumed on mobile” is surely a winning argument!

FaceTime Baby

My nephew Samvit, born in 2011, doesn’t talk much on the phone. It’s possibly because he didn’t talk much on the phone as a baby, but I’ve never been able to have a decent phone conversation with him (we get along really well when we meet, though). He talks a couple of lines and hands over the phone to his mother and runs off. If it’s a video call, he appears, says hi and disappears.

Berry (born in 2016), on the other hand, seems to have in a way “leapfrogged” the phone. We moved to London when she was five and a half months old, and since then we’ve kept in touch with my in-laws and other relatives primarily through video chat (FaceTime etc.). And so Berry has gotten used to seeing all these people on video, and has become extremely comfortable with the medium.

For example, when we were returning from our last Bangalore trip in December, we were worried that Berry would miss her grandparents tremendously. As it turned out, we landed in London and video called my in-laws, and Berry was babbling away as if there was no change in scene!

Berry has gotten so used to video calling that she doesn’t seem to get the “normal” voice call. Sure enough, she loves picking up the phone and holding it against her ear and saying “hello” and making pretend conversations (apparently she learnt this at her day care). But give her a phone and ask her to talk, and she goes quiet unless there’s another person appearing on screen.

Like there’s this one aunt of mine who is so tech-phobic that she doesn’t use video calls. And every time I call her she wants to hear Berry speak, except that Berry won’t speak because there is nobody on the screen! I’m now trying to figure out how to get this aunt comfortable with video calling just so that Berry can talk to her!

In that sense, Berry is a “video call” native. And I wouldn’t be surprised if it turns out that she’ll find it really hard to get comfortable with audio calls later on in life.

I’ll turn into one uncle now and say “kids nowadays…”

More issues with Slack

A long time back I’d written about how Slack in some ways was like the old DBabble messaging and discussion group platform, except for one small difference – Slack didn’t have threaded conversations which meant that it was only possible to hold one thread of thought in a channel, significantly limiting discussion.

Since then, Slack has introduced threaded conversations, but done it in an atrocious manner. The same linear feed in each channel remains, but there’s now a way to reply to specific messages. However, even in this little implementation Slack has done worse than even WhatsApp – by default, unless you check one little checkbox, your reply will only be sent to the person who originally posted the message, and doesn’t really post the message on the group.

And if you click the checkbox, the message is displayed in the feed, but in a rather ungainly manner. And threads are only one level deep (this was one reason I used to prefer LiveJournal over blogspot back in the day – comments could be nested in the former, allowing for significantly superior discussions).

Anyway, the point of this post is not about threads. It’s about another bug/feature of Slack which makes it an extremely difficult tool to use, especially for people like me.

The problem with Slack is that it nudges you towards sending shorter messages rather than longer messages. In fact, there’s no facility at all to send a long well-constructed argument unless you keep holding on to Shift+Enter every time you need a new line. There is an “insert text snippet” feature, but that lacks richness of any kind – like bullet points, for example.

What this does is to force you to use Slack for quick messages only, or only share summaries. It’s possible that this is a design feature, intended to capture the lack of attention span of the “twitter generation”, but it makes it an incredibly hard platform to use to have real discussions.

And when Slack is the primary mode of communication in your company (some organisations have effectively done away with email for internal communications, preferring to put everything on Slack), there is no way at all to communicate nuance.

PS: It’s possible that the metric for someone at Slack is “number of messages sent”. And nudging users towards writing shorter messages can mean more messages are sent!

PS2: DBabble allowed for plenty of nuance, with plenty of space to write your messages and arguments.

Coin change problem with change – Dijkstra’s Algorithm

The coin change problem is a well studied problem in Computer Science, and is a popular example given for teaching students Dynamic Programming. The problem is simple – given an amount and a set of coins, what is the minimum number of coins that can be used to pay that amount?

So, for example, if we have coins for 1,2,5,10,20,50,100 (like we do now in India), the easiest way to pay Rs. 11 is by using two coins – 10 and 1. If you have to pay Rs. 16, you can break it up as 10+5+1 and pay it using three coins.

The problem with the traditional formulation of the coin change problem is that it doesn’t involve “change” – the payer is not allowed to take back coins from the payee. So, for example, if you’ve to pay Rs. 99, you need to use 6 coins (50+20+20+5+2+2). On the other hand, if change is allowed, Rs. 99 can be paid using just 2 coins – pay Rs. 100 and get back Re. 1.
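
For reference, the traditional no-change version can be sketched with the textbook dynamic programme. This is a minimal Python illustration rather than the R code mentioned later in this post; the function name `min_coins_no_change` and the default coin set (the Indian denominations listed above) are choices made for the example.

```python
def min_coins_no_change(amount, coins=(1, 2, 5, 10, 20, 50, 100)):
    """Fewest coins that add up exactly to `amount` when no change is given."""
    unreachable = float("inf")
    # best[v] = fewest coins needed to pay exactly v; paying 0 needs no coins
    best = [0] + [unreachable] * amount
    for value in range(1, amount + 1):
        for coin in coins:
            # Try finishing the payment of `value` with this coin
            if coin <= value and best[value - coin] + 1 < best[value]:
                best[value] = best[value - coin] + 1
    return best[amount]


print(min_coins_no_change(11))  # 2 coins: 10 + 1
print(min_coins_no_change(16))  # 3 coins: 10 + 5 + 1
print(min_coins_no_change(99))  # 6 coins: 50 + 20 + 20 + 5 + 2 + 2
```

The table is filled from 0 upwards, which works only because every coin moves the running total in the same direction.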

So how do you determine the way to pay using fewest coins when change is allowed? In other words, what happens to the coin change problem when negative coins can be used? (Paying 100 and getting back 1 is the same as paying 100 and (-1).)

Unfortunately, dynamic programming doesn’t work in this case, since we cannot process in a linear order. For example, the optimal way to pay 9 rupees when negatives are allowed is to break it up as (+10,-1), and calculating from 0 onwards (as we do in the DP) is not efficient.

For this reason, I’ve used an implementation of Dijkstra’s algorithm to determine the minimum number of coins to be used to pay any amount when cash back is allowed. Each amount is a node in the graph, with an edge between two amounts if the difference in amounts can be paid using a single coin. So there is an edge between 1 and 11 because the difference (10) can be paid using a single coin. Since cash back is allowed, the graph need not be directed.

So all we need to do to determine the optimal way to pay each amount is to run Dijkstra’s algorithm starting from 0. Since every edge costs exactly one coin, this amounts to a breadth first search. The search has complexity $O(M^2 n)$, where M is the maximum amount we want to pay and n is the number of coins.
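
The graph search described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the R implementation linked below: the function name `min_coins_with_change` is made up for the example, and the search window is padded by the largest coin on either side of the range 0 to M so that routes such as 100 - 1 for Rs. 99 are reachable. That padding is enough, since the coins in any solution can be reordered to keep the running total within one coin of the range.

```python
from collections import deque


def min_coins_with_change(max_amount, coins=(1, 2, 5, 10, 20, 50, 100)):
    """Fewest coins changing hands, in either direction, for each amount 0..max_amount."""
    pad = max(coins)
    lowest, highest = -pad, max_amount + pad  # padded window (assumption of this sketch)
    dist = {0: 0}       # dist[a] = fewest coins needed to reach a net payment of a
    queue = deque([0])
    while queue:
        amount = queue.popleft()
        for coin in coins:
            # Each coin is an undirected edge: the payer hands it over or gets it back
            for neighbour in (amount + coin, amount - coin):
                if lowest <= neighbour <= highest and neighbour not in dist:
                    dist[neighbour] = dist[amount] + 1
                    queue.append(neighbour)
    return [dist[a] for a in range(max_amount + 1)]


fewest = min_coins_with_change(100)
print(fewest[99])  # 2 coins: pay 100, get back 1
print(fewest[9])   # 2 coins: pay 10, get back 1
print(fewest[16])  # 3 coins, e.g. pay 20 and 1, get back 5
```

A queue-based search like this takes each amount off the queue once and tries every coin in both directions, so its work grows roughly as M times n, comfortably within the bound quoted above.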

I’ve implemented this algorithm using R, and the code can be found here. I’ve also used the algorithm to compute the number of coins to be used to pay all numbers between 1 and 10000 under different scenarios, and the results of that can be found here.

You can feel free to use this algorithm or code or results in any of your work, but make sure you provide appropriate credit!

PS: I’ve used “coin” here in a generic sense, in that it can mean “note” as well.

Explaining UPI

I just paid my cook his salary for November. Given the cash crunch, I paid him through a bank transfer, using IMPS. Earlier today, my wife had asked him for his account details (last month I’d paid him on his wife’s account).

An hour back he sent me his account details (including account number and IFSC) via WhatsApp. I had to wait till I got home and got access to my laptop (Citibank app doesn’t let you add payees on mobile banking).

I get home, log in to Citibank Online. Add payee, which includes typing his bank account number twice. Get SMS asking me to confirm payee addition. I authorise payee. And after all this I am able to finally do the transfer – and I expect him to have got his money already.

For a long time I was wondering what the big deal with UPI was, given that IMPS is already fast enough. Having finally tried UPI earlier this week (it’s finally coming to iOS, but only available on ICICI now. And the implementation so far sucks, since you need to pull out your debit card for two factor authentication – defeating the point of UPI. I’m told it’s better on Android), I realise how much easier and safer the transaction would’ve been.

Firstly, the cook needn’t have sent me his account number. All I would need was his virtual payment address. I would then open my UPI app (in my case, iMobile) and click on “send money”. And then I’d add his virtual ID there, following which his name would appear. Two or three more clicks and my PIN code later, the transfer would be done.

No bank account number. Not even a mobile number or an email ID. Just a random string of characters would allow me to transfer money to him! And later I could give him my UPI ID, and next month onwards he could simply send me a request via UPI for his salary. And two clicks later it would be done!

Mint has reported that there are massive delays in merchants installing point of sale devices in response to the cash ban. Banks should instead seek to acquire merchants to accept money via UPI. It’s simple, it’s quick and it protects privacy.

In fact, if the bank sales staff now have bandwidth, it can be argued that all the planets have aligned for UPI to take off for merchant payments – people have less cash, point of sale devices are not available, and both merchants and shoppers have shown openness to cashless payments, and there is a push from the government.

If only the banks can bite…

Siri Apps

So now that Apple has opened up Siri to third party developers (starting with iOS 10), I hope that app makers make good use of this feature.

There are some apps, for example, that require inputs from time to time. My wife, for example, recently downloaded an app to track our daughter’s inputs and outputs.

The problem was that each time our daughter either input or output something, my wife had to unlock her iPhone, go to this app, and then use three or four more clicks to enter the necessary information. The complexity was so much that within a day she (wife) had stopped using it, and presently uninstalled the app.

This kind of app that needs frequent inputs seems ideally placed to get Siri integration. Imagine saying “Hey Siri. My baby just shat”, and the app gets immediately updated, and stuff. Given how powerful Siri already is, I don’t see any technical challenge in implementing this. The challenge, though, is on the product front, for apps to be able to use this judiciously.

PS: I just updated my laptop to macOS Sierra, and so now have Siri on my Mac as well. The only problem is that Siri on the phone is so much more responsive than Siri on the Mac. So when I want to call the Mac Siri, the phone Siri responds first, annoying me!