Networking events and positions of strength

This replicates some of what I wrote in a recent blog post, but I put this on LinkedIn and wanted a copy here for posterity.

Having moved my consulting business to London earlier this year, I’ve had a problem with marketing. The basic problem is that while my network and brand are fairly strong in India, I’ve had to start from scratch in the UK.

The lack of branding has meant that I have often had to talk or negotiate from a position of weakness (check out my recent blog post on branding as creating a position of strength). The lack of a network has meant that I go to networking events where I can meet people and build one. Except that the lack of branding means that I have to network from a position of weakness, and hence fail to make an impact.

A few months back I came across this set of tweets by AngelList founder Naval Ravikant, in which he talked about productivity hacks.

One that caught my eye, which I try to practise but have not always managed to, is not going to conferences if you are not speaking. However, now that I think about it from the point of view of branding and positions of strength, what he says makes total sense.

In conferences and networking events, there is usually a sort of unspoken hierarchy, where speakers are generally “superior” to those in the audience. This flows from the assumption that the audience has come to gather pearls of wisdom from the speakers. And this has an impact on the networking around the event – if you are speaking, people will start with the prior that you are a superior being, compared to when you attend as an audience member (especially if it is a paid event).

This is not a strict rule – when there are other people at the event who you know, it is possible that their introductions can elevate you even if you are not speaking. However, if you are at an event where you don’t know anyone else, you surely start on higher ground (no pun intended) if you are speaking.

There is another advantage that speaking offers – you can use your speech itself to build your brand, which will be fresh in your counterparties’ minds in the networking immediately afterward. Audience members have no such brand-building ability, apart from the possibility of tarnishing their own brands through inappropriate or rambling questions.

So unless you see value in what the speaker(s) say, don’t go to conferences. Putting it another way, don’t go to conferences for networking alone, unless you are speaking. Extending this, don’t go to networking events unless you either know some of the other people who are coming there (whose links you can then tap) or if there is an opportunity for you to elevate your brand at the event (by speaking, for example).

PS: Some of Naval’s other points such as having “meeting days” and scheduling meetings for later in the day are pertinent as well, and I’ve found them to be incredibly useful.

Triangle marketing

This blog post is based more on how I have bought rather than how I have sold. The basic concept is that when you hear about a product or service from two or more independent sources, you are more likely to buy it.

The threshold varies by the kind of product you are looking at. When it is a low-touch item like a book, two independent recommendations are enough. When it involves higher cost and has higher impact, like a phone, it might be five recommendations. For something life-changing like a keto diet, it might be ten (I must mention I tried keto for half a day and gave up, not least because I figured I don’t really need it – I’m barely 3-4 kg overweight).

The important point to note is that the recommendations need to come from independent sources – if two people who you didn’t expect to have a similar taste in books were to recommend the same book, the second of these recommendations is likely to create an “aha moment” (ok I’m getting into consultant-speak now), and that is likely to drive a purchase (or at least trying a Kindle sample).

In some ways, exposure to the same product through independent sources is likely to create a feeling of a self-fulfilling prophecy. “Alice is also using this. Bob is also using this” will soon turn into “everybody seems to be using it. I should also use it”.

So what does this mean for you if you are a seller? Basically, you need to hit your target audience through various channels. I mentioned in my post earlier this week how branding creates a “position of strength”, and how direct sales is normally hard because it is done from a position of weakness.

The idea is that before you hit your audience with a direct sale, you need to “warm them up” with your brand, and you need to do this through various channels. Your brand needs to make an impact on your audience through multiple independent channels, so that it has become a self-fulfilling prophecy before you approach to make the sale.

What these precise channels are depends on your business and the product that you’re trying to sell, but the important thing is that they are independent. So for example, putting advertisements in various places won’t help since the target will treat all of them as coming from the same source.

Finally, where is the “triangle” in this marketing? It is in the idea that you complete the branding and sales by means of “triangulation”. You send out vectors in seemingly random directions trying to build your brand, and they will get reflected till a time when they intersect, or “triangulate”. Ok I know my maths here is messy and not up to my usual standard, but I guess you know what I’m getting at!

Attractive graphics without chart junk

A picture is worth a thousand words, but ten pictures are worth much less than ten thousand words

One of the most common problems with visualisation, especially in the media, is that of “chart junk”. Graphics designers working for newspapers and television channels like to decorate their graphs to make them more visually appealing. And in most cases, this results in the information in the graphs getting obfuscated and becoming harder to read.

The commonest form this takes is the replacement of bars in a simple bar graph with weird objects. When you want to show the number of people in something, you show little people, sometimes half shaded out. Sometimes, instead of having multiple people, the information is conveyed in the size of the people, or of other objects.

Then, instead of using simple bar graphs, designers use more complicated structures such as 3-dimensional bar graphs, cone graphs or doughnut charts (I’m sure I’ve abused some of them on my tumblr). All of them are visually appealing and can draw the attention of readers or viewers. Most of them come at the cost of not really conveying the information!

I’ve spoken to a few professional graphic designers and asked them why they make poor visualisation choices even when those choices reduce the amount of information the graphics convey. The most common answer is novelty – “a page full of bars can be boring for the reader”. So they try to spice it up by replacing bars with other items that “look different”.

Putting it another way, the challenge is two-fold – first, you need to get your readers to look at your graph (this is where novelty helps). And once you’ve got them to look at it, you need to convey information to them. The two objectives can sometimes collide, with the best-looking graphs not being the ones that convey information best. And this combination of looking good and being effective is possibly what turns visualisation into an art.

My way of dealing with this has been to play around with the non-essential bits of the visualisation. Using colours judiciously, for example. Using catchy headlines. Adding decorations outside of the graphs.
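To make that concrete, here is a minimal sketch (in Python, with matplotlib) of the sort of thing I mean – plain bars, with the “spice” coming from a single highlight colour and a headline that carries the message. The data, labels and colours are all made up for illustration.

```python
import matplotlib.pyplot as plt

# Made-up data for illustration
categories = ["North", "South", "East", "West"]
values = [42, 35, 20, 15]

fig, ax = plt.subplots(figsize=(6, 4))

# Plain bars; one colour highlights the point of the story, grey for the rest
colors = ["#d62728" if v == max(values) else "#bbbbbb" for v in values]
ax.bar(categories, values, color=colors)

# A catchy headline carries the message, so the bars don't have to
ax.set_title("North alone outsells East and West combined")
ax.set_ylabel("Sales (units)")

# Strip the non-essential junk: no top or right spines
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)

plt.tight_layout()
plt.show()
```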

Another lesson I’ve learnt over time is to not have too many graphics in the same piece. Some of this has come due to pushback from my editors at Mint, who have frequently asked me to cut the number of graphs for space reasons. And some of this is something I’ve learnt as a reader.

The problem with visualisations is that while they can communicate a lot of information, they can break the flow in reading. So having too many visualisations in the piece means that you break the reader’s flow too many times, and maybe even risk your article looking academic. Cutting visualisations forces you to be concise in your use of pictures, and you leave in only the ones that are most important to your story.

There is one other upshot of cutting the number of visualisations – when you have one bar graph and one line graph, you can leave them as they are, and not morph or “decorate” them just for the heck of it!

PS: Even experienced visualisers are not immune to having their graphics mangled by editors. Check out this tweet storm by Edward Tufte, the guru of visualisation.

Taking your audience through your graphics

A few weeks back, I got involved in a Twitter flamewar with Shamika Ravi, a member of the Indian Prime Minister’s Economic Advisory Council. The object of the argument was a set of gifs she had released to show different aspects of the Indian economy. Admittedly I started the flamewar. Guilty as charged.

Thinking about it now, this wasn’t the first time I was complaining about her gifs – I began my now popular (at least on Twitter) Bad Visualisations tumblr with one of her gifs.

So why am I so opposed to animated charts like the one in the link above? It is because they demand too much of the consumer’s attention, and it is hard to get information out of them. If you notice something interesting, by the time you have digested the information the graphic has moved several frames forward.

Animated charts became a thing about a decade ago following the late Hans Rosling’s legendary TED Talk. In this lecture, Rosling used “motion charts” (a concept he possibly invented) – which was basically a set of bubbles moving around a chart, as he sought to explain how the condition of the world has improved significantly over the years.

It is a brilliant talk. It is a very interesting set of statistics simply presented, as Rosling takes the viewers through them. And the last phrase is the most important – these motion charts work for Rosling because he talks to the audience as the charts play out. He pauses when there is some explanation to be made or the charts are at a key moment. He explains some counterintuitive data points exhibited by the chart.

And this is precisely how animated visualisations need to be done, and where they work – as part of a live presentation where a speaker is talking along with the charts and using them as visual aids. Take Rosling (or any other skilled speaker) away from the motion charts, though, and you will see them fall flat – without knowing what the key moments in the chart are, and without the right kind of annotations, the readers are lost and don’t know what to look for.

There are a large number of aids to speaking that can occasionally double up as aids to writing. Graphics and charts are one example. Powerpoint (or Keynote or Slides) presentations are another. And the important thing with these visual aids is that the way they work as an aid is very different from the way they work standalone. And the makers need to appreciate the difference.

In business school, we were taught to follow the 5 by 5 formula (or some such thing) while making slides – that a slide should have no more than five bullet points, and each point should have no more than five words. This worked great in school as most presentations we made accompanied our talks.

Once I started working (for a management consultancy), though, I realised this didn’t work there because we used powerpoint presentations as standalone written communications. Consequently, the amount of information on each slide had to be much greater, else the reader would fail to get any information out of it.

Conversely, a powerpoint presentation meant as a standalone document would fail spectacularly when used to accompany a talk, for there would be too much information on each slide, and massive redundancy between what is on the slide and what the speaker is saying.

The same classification applies to graphics as well. Interactive and animated graphics do brilliantly as part of speeches, since the speaker can control what the audience is seeing and make sure the right message gets across. As part of “print” (graphics shared standalone, like on Twitter), though, these graphics fail as readers fail to get information out of them.

Similarly, a dense well-annotated graphic that might do well in print can fail when used as a visual aid, since there will be too much information and audience will not be able to focus on either the speaker or the graphic.

It is all about the context.

Analytics for general managers

While good managers have always been required to be analytical, the level of analytical ability being asked of managers has been going up over the years, with the increase in availability of data.

Now, this post is once again based on that one single and familiar data point – my wife. In fact, if you want me to include more data in my posts, you should talk to me more.

Leaving that aside, my wife works as a mid-level manager for an extremely large global firm. She was recruited straight out of business school for an “MBA track” program. And from our discussions about her work in the first few months, one thing she did lots of was writing SQL queries. And she still spends a lot of her time writing queries and building Excel models.

This isn’t something she was trained for, or was tested on while being recruited. She did her MBA in a famously diverse global business school, the diversity of its student body implying that the level of maths and quantitative methods was kept rather low. She was recruited as a “general manager”. Yet, in a famously data-driven company, she spends a considerable amount of time on quantitative stuff.

It wasn’t always like this. While analytical ability is what (in my opinion) has set apart graduates of elite MBA programs from those of middling MBA programs, the level of quantitative ability expected of MBAs (apart from maybe those in finance) wasn’t too high. You were expected to know how to use spreadsheets. You were expected to know some rudimentary statistics – means and standard deviations and some basic hypothesis testing, maybe. And you were expected to be able to make managerial decisions based on numbers. That’s about it.

Over the years, though, as the corpus of data within (and outside) organisations has grown, and making decisions based on data has become fashionable (a brilliant thing as far as I’m concerned), the requirement from managers has grown as well. Now they are expected to do more with data, and aren’t always trained for that.

Some organisations have responded to this problem by supplying “data analysts” who are attached to mid level managers, so that the latter can outsource the analytical work to the former and spend most of their time on “managerial” stuff. The problem with this is twofold – it is hard to guarantee a good career path to this data analyst (which makes recruitment hard), and this introduces “friction” – the manager needs to tell the analyst what precise data and analysis she needs, and iterating on this can lead to a lot of time lost.

Moreover, as the size of the data has grown, the complexity of the analysis that can be done and the insights that can be produced has become greater as well. And in that sense, managers who have been able to adapt to the volume and complexity of data have a significant competitive advantage over their peers who are less comfortable with data.

So what does all this mean for general managers and their education? First, I would expect the smarter managers to know that data analysis ability is a competitive advantage, and so invest time in building that skill. Second, I know of some business schools that are making their MBA programs less quantitative, as their student body becomes more diverse and the recruitment body becomes less diverse (banks are recruiting far less nowadays). This is a bad move. In fact, business schools need to realise that a quantitative MBA program is more of a competitive advantage nowadays, and tune their programs accordingly, while not compromising on the diversity of the student intake.

Then, there is a generation of managers that got along quite well without getting its hands dirty with data. These managers will now get challenged by younger managers who are more conversant with data. It will be interesting to see how organisations deal with this dynamic.

Finally, organisations need to invest in training programs, to make sure that their general managers are comfortable with data, and analysis, and making use of internal and external data science resources. Interestingly enough (I promise I hadn’t thought of this when I started writing this post), my company offers precisely one such workshop. Get in touch if you’re interested!

The missing middle in data science

Over a year back, when I had just moved to London and was job-hunting, I was getting frustrated by the fact that potential employers didn’t recognise my combination of skills of wrangling data and analysing businesses. A few saw me purely as a business guy, and most saw me purely as a data guy, trying to slot me into machine learning roles I was thoroughly unsuited for.

Around this time, I happened to mention this lack of fit to my wife, and she remarked that the reason companies want either pure business people or pure data people is that you can’t scale a business with people with a unique combination of skills. “There are possibly very few people with your combination of skills”, she had said, and hence companies had gotten around the problem by getting some very good business people and some very good data people, and hoping that together they can add value.

More recently, I was talking to her about some of the problems that she was dealing with at work, and recognised one of them as being similar to what I had solved for a client a few years ago. I quickly took her through the fundamentals of K-means clustering, and showed her how to implement it in R (and in the process, taught her the basics of R). As it had with my client a few years ago, clustering did its magic, and the results were literally there to see, the business problem solved. My wife, however, was unimpressed. “This requires too much analytical work on my part”, she said, adding that “If I have to do this level of analytical work, I won’t have enough time to execute my managerial duties”.
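(An aside for the curious: what I showed her looks roughly like the sketch below. The original was in R; this is a minimal Python equivalent, with made-up data, column names and a guessed choice of three clusters – not the actual client problem.)

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Toy "customer" data standing in for the real business data
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "annual_spend": rng.gamma(shape=2.0, scale=500.0, size=300),
    "visits_per_month": rng.poisson(lam=4, size=300).astype(float),
})

# Standardise first, so that no single column dominates the distance metric
X = StandardScaler().fit_transform(df)

# Fit K-means with an (assumed) k of 3, and attach the labels for inspection
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
df["cluster"] = km.labels_

# The cluster means are usually enough to "see" the segments
print(df.groupby("cluster").mean())
```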

This made me think about the (yet unanswered) question of who should be solving this kind of problem – taking a business problem, recognising that it can be solved using data, figuring out the right technique to apply, and then communicating the results in a way that the business can easily understand. And this was a one-time problem, not something that would need to be solved repeatedly, and so without the requirement to set up a pipeline, data engineering and IT infrastructure around it.

I admit this is just one data point (my wife), but based on observations from elsewhere, managers are usually loath to get their hands dirty with data, beyond perhaps doing some basic MS Excel work. Data science specialists, on the other hand, will find it hard to quickly build intuition for a one-time problem, gather data in a “dirty” manner, apply the right technique to solve it, and communicate the results in a business-friendly manner. Moreover, data scientists are highly likely to be involved in regular, repeatable activities, making it an organisational nightmare to “lease” them for such one-time efforts.

This is what I call the “missing middle problem” in data science: problems whose solutions would without doubt add value to the business, but which most businesses are unable to address because they lack the adequate skillset, and whose one-time nature makes it difficult for businesses to dedicate permanent resources to solving.

I guess so far this post has all the makings of a sales pitch, so let me turn it into one – this is precisely the kind of problem that my company Bespoke Data Insights is geared to solving. We specialise in solving problems that lie at the cusp of business and data. We provide end-to-end quantitative solutions for typically one-time business problems.

We come in, understand your business needs, and use a hypothesis-driven approach to model the problem in data terms. We select methods that in our opinion are best suited to the precise problem, not hesitating to build our own models if necessary (hence the Bespoke in the name). And finally, we synthesise the analysis in the form of recommendations that any business person can easily digest and act on.

So – if you’re facing a business problem where you think data might help, but don’t know how to proceed; or if you are curious about all this talk about AI and ML and data science, and want to include it in your business; or if you want your business managers to figure out how to use your data teams better – hire us.

Statistics and machine learning approaches

A couple of years back, I was part of a team that delivered a workshop in machine learning. Given my background, I had been asked to do a half-day session on Regression, and was told that the standard software package being used was the scikit-learn package in python.

Both the programming language and the package were new to me, so I dug around for a few days before the workshop, trying to figure out regression. Despite my best efforts, I couldn’t locate how to get the R^2 out of it. What some googling told me was surprising:

There exists no R type regression summary report in sklearn. The main reason is that sklearn is used for predictive modelling / machine learning and the evaluation criteria are based on performance on previously unseen data

As it happened, I requested the students at the workshop to install a package called statsmodels, which provides standard regression outputs. And then I proceeded to lecture to them on regression as I know it, including significance scores, p-values, t-statistics, multicollinearity and the like. It was only much later that I figured out that this is not how regression (and logistic regression) is done in the machine learning world.
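For illustration, this is roughly what I mean by “standard regression outputs” – a minimal statsmodels sketch on synthetic data (not the actual workshop material):

```python
import numpy as np
import statsmodels.api as sm

# Synthetic data: two predictors with known coefficients, plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=200)

X_const = sm.add_constant(X)   # statsmodels wants an explicit intercept column
model = sm.OLS(y, X_const).fit()

print(model.rsquared)          # the R^2 that sklearn has no summary report for
print(model.summary())         # coefficients, t-statistics, p-values, and so on
```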

In a statistical framework, the data sets in regression are typically “long” – you have a large number of data points, and a small number of variables. Putting it differently, we start off with a model with few degrees of freedom, and then “constrain” the variables with a large enough number of data points, so that if a signal exists, and it is in the right format (linear relationship and all that), we can pin it down effectively.

In a machine learning framework, it is common to run a regression where the number of data points is of the same order of magnitude as, or even smaller than, the number of variables. Strictly speaking, such a problem is under-determined (there are too many degrees of freedom), and so the regression is not well-defined. Instead, we rely upon “regularisation methods” to “tie down” the variables and (hopefully) produce a consistent solution.
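As a toy illustration of what “tying down” looks like, here is a sketch using ridge regression (one common regularisation method, which penalises large coefficients) on deliberately “wide” synthetic data – the dimensions and penalty strength are arbitrary assumptions:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
n, p = 50, 100                 # "wide" data: fewer data points than variables
X = rng.normal(size=(n, p))

true_coef = np.zeros(p)
true_coef[:5] = 2.0            # only a handful of variables carry any signal
y = X @ true_coef + rng.normal(scale=0.1, size=n)

# Plain OLS has no unique solution here; the L2 penalty (alpha) picks one out
ridge = Ridge(alpha=1.0).fit(X, y)
print(ridge.coef_[:10])        # the signal coefficients, shrunk but recovered
```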

Moreover, machine learning approaches are common in problems where individual predictor variables don’t have meaning. In this scenario, knowing whether a particular variable is significant or not is of no utility. The signal in machine learning lies in the combination of variables, which means that multicollinearity (correlation between predictor variables) is not as bad a thing as it is in statistics. Variables not having meaning also implies that there are no relationships per se to be interpreted, and so machine learning models are harder to interpret, and more likely to contain hidden spurious correlations.

Also, when you have a small number of variables and a large number of data points, it is easy to get an “exact solution” for regression, which is what statistical methods use. In a machine learning framework with “wide” data, though, exact solutions are computationally infeasible, and so you need to use approximate algorithms such as gradient descent – which are common across ML techniques.
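The contrast is easy to see on toy data – the exact normal-equation solve on the one hand, and plain full-batch gradient descent on the other, converging to (roughly) the same answer. The learning rate and iteration count below are untuned assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=1000)

# Exact solution: solve the normal equations (X'X) beta = X'y directly
beta_exact = np.linalg.solve(X.T @ X, X.T @ y)

# Approximate solution: gradient descent on the same least-squares objective
beta = np.zeros(3)
lr = 0.001                     # step size (assumed, untuned)
for _ in range(2000):
    grad = X.T @ (X @ beta - y)    # gradient of 0.5 * ||X beta - y||^2
    beta -= lr * grad

print(beta_exact)
print(beta)                    # close to beta_exact after enough iterations
```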

All in all, while statistics and machine learning might use techniques with the same name (“regression”, for example), they are, both in theory and in practice, very different ways of solving the problem. The important thing is to figure out the approach most suited to a particular problem, and use it accordingly.