The (missing) Desk Quants of Main Street

A long time ago, I’d written about my experience as a Quant at an investment bank, and about how banks like mine were sitting on a pile of risk that could blow up at any time.

There were two problems, as I had documented then. Firstly, most quants I interacted with seemed to be solving maths problems rather than finance problems, without bothering to check whether their models would stand the test of the markets. Secondly, there was an element of groupthink: quant teams were largely homogeneous, and it was hard to progress while holding contrarian views.

Six years on, there has been no blowup, and in some sense banks are actually doing well (I mean, they’ve declined compared to the time just before the 2008 financial crisis but haven’t done that badly). There have been no real quant disasters (yes I know the Gaussian Copula gained infamy during the 2008 crisis, but I’m talking about a period after that crisis).

There can be many explanations for how banks have avoided quant blow-ups despite their quants solving maths problems and all thinking alike, but the one I’m partial to is the presence of a “middle layer”.

Most of the quants I interacted with were “core”, in the sense that they were not attached to any sales or trading desks. Banks also typically have a large cadre of “desk quants”, who are directly associated with trading teams, and who build models and help with day-to-day risk management, pricing, and so on.

Since these desk quants work closely with the business, they turn out to be much more pragmatic than the core quants – they have a good understanding of the market and use the models more as guiding principles than as rules. At the same time, they bring the benefits of quantitative models (and of the core quants’ work) into the day-to-day business.

Back during the financial crisis, I’d jokingly suggested that other industries should hire the quants who were now surplus to Wall Street’s requirements. Around the same time, DJ Patil et al came up with the concept of the “data scientist” and called it the “sexiest job of the 21st century”.

And so other industries started getting their own share of quants, or “data scientists” as they were now called. Nowadays it’s fashionable even for small companies, for whose business data is not critical, to have a data science team. Being in this profession now (I loathe calling myself a “data scientist” – I prefer to say “quant” or “analytics”), I’ve come across quite a few of these.

The problem I see with “data science” on “Main Street” (a phrase that gained currency during the financial crisis as the opposite of Wall Street, referring to “normal” businesses) is that it lacks this cadre of desk quants. Most data scientists are highly technical people who don’t necessarily have an understanding of the business they operate in.

Thanks to that, what I’ve noticed is that in most cases there is a chasm between the data scientists and the business, since they are unable to talk in a common language. As I’m prone to saying, this can go one of two ways – the business guys can either assume that the data science guys are geniuses and take their word as gospel, or they can totally disregard the data scientists as people who do some esoteric maths and don’t really understand the world. In either case, the value added is suboptimal.

It is not hard to understand why “Main Street” doesn’t have a cadre of desk quants – it’s because of the way the data science industry has evolved. Quant work at investment banks evolved over a long period of time – the Black-Scholes equation was proposed in the early 1970s. So quants were first recruited to work directly with traders, and core quants (at the banks that have them) were a later addition, once banks realised that some quant functions could be centralised.

On the other hand, the growth of “data science” has been rather sudden. The volume of data, cheap incrementally available cloud storage, easy processing and the popularity of the phrase “data science” have all grown at a rapid rate in the last decade or so, and companies have scrambled to set up data teams. There has simply been no time to train people who get both the business and the data – and so the data scientists exist as appendages that are either worshipped or ignored.

When a two-by-two ruins a scatterplot

The BBC has some very good analysis of the Brexit vote (how long back was that?), using voting data at the local authority level, and correlating it with factors such as ethnicity and educational attainment.

In terms of educational attainment, there is a really nice chart that shows the proportion of voters who voted to leave plotted against the proportion of the population in the ward with at least a bachelor’s degree. One look at the graph tells you that the correlation is rather strong:

Source: http://www.bbc.com/news/uk-politics-38762034

And then there is the two-by-two that is superimposed on this – with regions marked off in pink and grey. The idea of the two-by-two must have been to illustrate the correlation – to show that education is negatively correlated with the “leave” vote.

But what do we see here? A majority of the points lie in the bottom-left pink region, suggesting that wards with a lower proportion of graduates were less likely to vote leave. And this is entirely the wrong message for the graph to send.

The two-by-two would have been useful had the points in the graph been neatly divided into clusters that could be arranged in a grid. Here, though, what the scatter plot shows is a nice negatively correlated linear relationship. And by putting those pink and grey boxes, the illustration is taking attention away from that relationship.

Instead, I’d simply put the scatter plot as it is, and maybe add the line of best fit, to emphasise the negative correlation. If I want to be extra geeky, I might also write down the R^2 next to the line, to show the extent of correlation!
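For illustration, here is a minimal sketch of the chart I have in mind, assuming the ward-level numbers sit in a hypothetical brexit_wards.csv with made-up column names (pct_with_degree and pct_leave):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical file and column names -- the point is just the plot structure
df = pd.read_csv("brexit_wards.csv")
x = df["pct_with_degree"].values  # % of residents with at least a bachelor's degree
y = df["pct_leave"].values        # % voting to leave

# Line of best fit and R^2
slope, intercept = np.polyfit(x, y, 1)
r_squared = np.corrcoef(x, y)[0, 1] ** 2

plt.scatter(x, y, alpha=0.6)
xs = np.linspace(x.min(), x.max(), 100)
plt.plot(xs, slope * xs + intercept, color="black")
plt.xlabel("% with at least a bachelor's degree")
plt.ylabel("% voting to leave")
plt.annotate(f"$R^2$ = {r_squared:.2f}", xy=(0.75, 0.9), xycoords="axes fraction")
plt.show()
```

The best-fit line and the R^2 annotation carry everything the pink and grey boxes were trying (and failing) to convey.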


Financial inclusion and cash

Varad Pande and Nirat Bhatnagar have an interesting op-ed today in Mint about financial inclusion – about how financial institutions haven’t been innovative in making products suited to the poor, and how better user interfaces can also drive financial inclusion. I found this example they take rather interesting:

Take, for instance, a daily wager who makes Rs200 on the days she gets work. Work is unpredictable, and expenses too can be volatile, so she has to borrow money for buying vegetables, or to pay the doctor’s fees when her children fall sick. Her real need is for a flexible—small ticket, variable amount, rapid approval—loan product that she can access instantly. Unfortunately, no institutional channel—neither the public sector bank where she has a “no frills” account, nor the MFI that she has previously borrowed from—offers such a product. She ends up borrowing from neighbours, often from the local moneylender.

Now, based on my experience in FinTech, it is not hard to design a loan product for someone whose cash flows are known. The bank statement is nothing but a continuing story of the account holder’s life, and if you can understand the cash flows (both in and out) for a reasonable period of time, it is straightforward to design a loan product that fits that cash flow pattern.
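As a toy illustration – and nothing more than a sketch, with made-up numbers and an arbitrary affordability threshold – this is the kind of rule a lender could derive once the statement-level cash flows are visible:

```python
from collections import defaultdict
from datetime import date

def affordable_emi(transactions, emi_fraction=0.3):
    """Toy underwriting rule: cap the monthly repayment at a fraction of the
    median monthly net inflow seen in the statement. Purely illustrative."""
    monthly = defaultdict(float)
    for txn_date, amount in transactions:  # amount > 0 is an inflow, < 0 an outflow
        monthly[(txn_date.year, txn_date.month)] += amount

    net_inflows = sorted(monthly.values())
    if not net_inflows:
        return 0.0
    median_net = net_inflows[len(net_inflows) // 2]
    return max(0.0, emi_fraction * median_net)

# Three months of (made-up) statement data for a daily wager
txns = [(date(2017, 1, 5), 200), (date(2017, 1, 20), -150),
        (date(2017, 2, 3), 400), (date(2017, 2, 25), -300),
        (date(2017, 3, 10), 600), (date(2017, 3, 15), -250)]
print(affordable_emi(txns))  # a rough monthly repayment capacity, in rupees
```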

The key thing, however, is that you need to have full information on transactions, in terms of when cash comes in and goes out, what the cash outflow is used for, and all that. And that is where the cash economy is a bit of a bummer.

For a banker who is trying to underwrite, and to decide the kind of loan product (and interest rate) to offer a customer, the customer’s cash transactions obscure information – information that could’ve been used by the bank to design/structure/recommend the appropriate product for the customer.

For the case that Pande and Bhatnagar take, if all inflows and outflows are in cash, there is little beyond the potential borrower’s word that can convince bankers of the borrower’s creditworthiness. And so the potential borrower is excluded from the system.

If, on the other hand, the potential borrower were to have used non-cash means for all her transactions, bankers would have had a full picture of her life, and would have been able to give her an appropriate loan!

In this sense, I think financial inclusion has so far been going on ass-backwards, with most microfinance institutions (MFIs) targeting loans rather than deposits. And with little data to base credit decisions on, this has resulted in wide credit spreads and interest rates that might be seen as usurious.

Instead, if banks and MFIs had gone the other way – first getting customers to deposit, and then to use the bank account for as many of their transactions as possible – it would have been possible to design much better financial products, and to include more customers!

The current disruption in the cash economy possibly offers banks and MFIs a good chance to rectify their errors so far!

Intermediation and the battle for data

The Financial Times reports ($) that, thanks to the rise of Alipay and WeChat’s payment system, China’s banks are losing significantly in terms of access to customer data. This is on top of the $20 billion or so they’re losing directly in fees to these intermediaries.

But when a consumer uses Alipay or WeChat for payment, banks do not receive data on the merchant’s name and location. Instead, the bank record simply shows the recipient as Alipay or WeChat.

The loss of data poses a challenge to Chinese banks at a time when their traditional lending business is under pressure from interest-rate deregulation, rising defaults, and the need to curb loan growth following the credit binge. Big data are seen as vital to lenders’ ability to expand into new business lines.

I had written earlier on my blog about how intermediaries such as Swiggy or Grofers, by inserting a layer between the restaurant or shop and the consumer, now have access to consumer data that earlier resided with the retailer.

What is interesting is that before businesses realised the value of customer data, they had plenty of access to such data and were doing little to leverage and capitalise on it. And now that people are realising the value of data, new intermediaries that are coming in are capturing the data instead.

From this perspective, the Unified Payments Interface (UPI), which launched last week, is a key step for Indian banks to hold on to customer data that they could otherwise have lost to payment wallet companies.

Already, some online payments are listed on my credit card statement in the name of the payment gateway rather than in the name of the merchant, denying the credit card issuers data on the customer’s spending patterns. If the UPI can truly take off as a successor to credit cards (rather than wallets), banks can continue to harness customer data.
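To make that concrete, here is a tiny made-up illustration (the narration strings and amounts are hypothetical) of why gateway-routed payments tell the issuer nothing about where the money actually went:

```python
import pandas as pd

# Hypothetical card-statement rows: for online payments, the gateway's name
# replaces the merchant's in the narration field.
stmt = pd.DataFrame({
    "narration": ["PAYGATEWAY*ONLINE", "BIG BAZAAR MUMBAI",
                  "PAYGATEWAY*ONLINE", "CAFE COFFEE DAY BLR"],
    "amount": [1200, 850, 450, 180],
})

# Spend by "merchant": everything routed through the gateway collapses into
# one opaque bucket, so spending patterns there are invisible to the issuer.
print(stmt.groupby("narration")["amount"].sum())
```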

Mike Hesson and cricket statistics

While a lot is made of the use of statistics in cricket, my broad view – based on how statistics are presented in the media and the odd player or coach interview – is that cricket hasn’t really learnt how to use statistics as it should. A lot of so-called insights are based on small samples, and coaches such as Peter Moores have been pilloried for their excessive focus on data.

In this context, I found this interview with New Zealand coach Mike Hesson on ESPNcricinfo rather interesting. From my reading of the interview, he seems to “get” data and how to use it, which helps explain how the New Zealand team has generally outperformed expectations over the last few years.

Some snippets:

You’re trying to look at trends rather than chuck a whole heap of numbers at players.

For example, if you look at someone like Shikhar Dhawan, against offspin, he’s struggled. But you’ve only really got a nine or ten-ball sample – so you’ve got to make a decision on whether it’s too small to be a pattern

Also, players take a little while to develop. You’re trying to select the player for what they are now, rather than what their stats suggest over a two or three-year period.

And there are times when you have to revise your score downwards. In our first World T20 match, in Nagpur, we knew it would slow up,


Go ahead and read the whole thing.
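To see just how little a nine or ten-ball sample can tell you (the Shikhar Dhawan example above), here is a quick back-of-the-envelope check with a Wilson confidence interval; the dismissal numbers are made up:

```python
import math

def wilson_interval(successes, n, z=1.96):
    """95% Wilson score interval for a proportion -- very wide when n is tiny."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2))
    return centre - half, centre + half

# Made-up numbers: 2 dismissals in 10 balls against offspin
low, high = wilson_interval(2, 10)
print(f"dismissal rate somewhere between {low:.0%} and {high:.0%}")
# roughly 6% to 51% -- far too wide to call it a pattern
```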

Restaurants, deliveries and data

Delivery aggregators are moving customer data away from the retailer, who now has less knowledge about his customer. 

Ever since data collection and analysis became cheap (with cloud-based on-demand web servers and MapReduce), there have been attempts to collect as much data as possible and use it to do better business. I must admit to being part of this racket, too, as I try to convince potential clients to hire me so that I can tell them what to do with their data and how.

And one of the more popular areas where people have been trying to use data is in getting to “know their customer”. This is not a particularly new exercise – supermarkets, for example, have been offering loyalty cards so that they can correlate purchases across visits and get to know you better. (As part of a consulting assignment, I once sat with my clients looking at a few supermarket bills; it was incredible how much we humans could infer about the customers just by looking at those bills.)
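For what it’s worth, here is a tiny sketch – with entirely made-up bills and column names – of the kind of stitching-together a loyalty card makes possible:

```python
import pandas as pd

# Hypothetical bill-level data; the loyalty card supplies the customer_id
# that lets individual bills be stitched into one view of the customer.
bills = pd.DataFrame({
    "customer_id": ["C1", "C1", "C2", "C2", "C2"],
    "visit_date": pd.to_datetime(["2017-01-02", "2017-01-16",
                                  "2017-01-03", "2017-01-10", "2017-01-24"]),
    "basket_value": [850, 920, 310, 275, 340],
    "baby_food_items": [2, 3, 0, 0, 0],
})

profile = bills.groupby("customer_id").agg(
    visits=("visit_date", "count"),
    avg_basket=("basket_value", "mean"),
    buys_baby_food=("baby_food_items", lambda s: (s > 0).any()),
)
print(profile)  # C1 looks like a young family; C2 a frequent, smaller-basket shopper
```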

The more recent trend (now that it has become possible to analyse large amounts of data) is to capture “loyalties” across several stores or brands, so that affinities can be tracked across them and the customer can be understood better. Given data privacy issues, this has typically been done by third-party agents, who then sell the insights back to the companies whose data they collect. An early example of this is Payback, which links activity on your ICICI Bank account with other products (telecom providers, retailers, etc.) to gain superior insights into what you are like.

Nowadays, with cookie farming on the web, this is more common, and you have sites that track your web cookies to figure out correlations between your activities, and thus infer your lifestyle, so that better advertisements can be targeted at you.

In the last two or three years, restaurants and retailers have made significant investments in devices to get to know their customers better. Traditional retailers are being fitted with point-of-sale devices (the provision of these devices is a highly fragmented market), and restaurants are trying to introduce loyalty schemes (again a highly fragmented market). This is all an attempt to get to know the customer better. Except that middlemen are ruining it.

I’ve written a fair bit about middleman apps such as Grofers and Swiggy. They are basically delivery apps, which pick up goods for you from a store and deliver them to your place. A useful service, though, as I suggest in my posts linked above, probably an overvalued one. As a greater share of a restaurant’s or store’s business goes to such intermediaries, there is another threat to the seller – the loss of customer data.

When Grofers buys my groceries from my nearby store, it is unlikely to tell the store who it is buying for; similarly when Swiggy buys my food from a restaurant. This means these sellers’ loyalty schemes will go for a toss. Of course, not offering the same loyalty program to the delivery companies is a no-brainer. But what the sellers are also missing out on is the customer data that they would otherwise have captured (had they sold directly to the customer).

A good thing for Grofers and Swiggy is that they’ve hit the market at a time when sellers are yet to fully realise the benefits of capturing customer data, so they may be able to capture such data cheaply, and perhaps sell it back to their seller clients. Yet, if you are a retailer who is selling to such aggregators and you value your customer data, make sure you get your pound of flesh from these guys.

On apps tracking you and turning you into “lab rats”

Tech2, a division of FirstPost, reports that “Facebook could be tracking all rainbow profile pictures”. In what I think is a nonsensical first paragraph, the report says:

Facebook’s News Feed experiment received a huge blow from its social media networkers. With the new rainbow coloured profile picture that celebrates equality of marriage turned us into ‘lab rats’ again? Facebook is probably tracking all those who are using its new tool to change the profile picture, believes The Atlantic.

I’m surprised things like this still make news. It is a feature (not a bug) of any good organisation that it learns from its user interactions and user behaviour, and hence tracking how users respond to certain kinds of news or updates is a fundamental part of how Facebook should behave.

And Facebook is a company that constantly improves and updates the algorithm it uses to decide what updates to show whom. To do that, it needs to maintain data on who liked what, commented on what, and turned off what kind of updates. Collecting, maintaining and analysing such data is a fundamental, and critical, part of Facebook’s operations, and expecting it not to do so is downright silly (as would be any decision by the management to stop experimenting or collecting data).

Whenever you sign on to an app or a service, you need to take it as a given that the app is collecting data and information from you. And that if you are not comfortable with this kind of data capture, you are better off not using the app. Of course, network effects mean that it is not that easy to live like you did in “the world until yesterday”.

This seems like yet another case of Radically Networked Outrage by outragers not having enough things to outrage about.