Maths, machine learning, brute force and elegance

Back when I was at the International Maths Olympiad Training Camp in Mumbai in 1999, the biggest insult one could hurl at a peer was to describe the latter’s solution to a problem as a “brute force solution”. Brute force solutions, which were often ungainly, laboured and unintuitive, were supposed to be the last resort, to be used only if one were thoroughly unable to find an “elegant solution” to the problem.

Mathematicians love and value elegance. While they might be comfortable with esoteric formulae and the Greek alphabet, they are always on the lookout for solutions that are, at least to the trained eye, intuitive to perceive and understand. Among other things, the belief is that it is much easier to get an intuitive understanding of an elegant solution.

When all the parts of the solution seem to fit so well into each other, with no loose ends, it is far easier to accept the solution as being correct (even if you don’t understand it fully). Brute force solutions, on the other hand, inevitably leave loose ends and appreciating them can be a fairly massive task, even to trained mathematicians.

In the conventional view, though, non-mathematicians don’t have much fondness for elegance. A solution is a solution, and a problem solved is a problem solved.

With the coming of big data and increased computational power, however, the tables are getting turned. In this case, the more mathematical people, who are more likely to appreciate “machine learning” algorithms, recommend “leaving it to the system” – unleashing the brute force of computational power on the problem so that the “best model” can be found, and later implemented.

And in this case, it is the “half-blood mathematicians” like me, who are aware of complex algorithms but are unsure of letting the system take over stuff end-to-end, who bat for elegance – to look at data, massage it, analyse it and then find that one simple method or transformation that can throw immense light on the problem, effectively solving it!

The world moves in strange ways.

Hedgehogs and foxes: Or, a day in the life of a quant

I must state at the outset that this post is inspired by the second chapter of Nate Silver’s book The Signal and the Noise. In that chapter, which is about election forecasting, Silver draws upon the old Russian parable of the hedgehog and the fox. According to that story, the fox knows several tricks while the hedgehog knows only one – curling up into a ball. The story ends in favour of the hedgehog, as none of the tricks of the unfocused fox can help him evade the predator.

Most political pundits, says Silver, are like hedgehogs. They have just one central idea to their punditry and they tend to analyze all issues through that. A good political forecaster, however, needs to be able to accept and process any new data inputs, and include them in his analysis. With just one technique, this can be hard to achieve, and so Silver says that to be a good political forecaster one needs to be a fox. While this might lead to some contradictory statements and thus bad punditry, it leads to good forecasts. Anyway, you can read about election forecasting in Silver’s book.

The world of “quant” and “analytics” which I inhabit is again similarly full of hedgehogs. You have the statisticians, whose solution for every problem is a statistical model. They can wax eloquent about Log Likelihood Estimators but can have trouble explaining why you should use that in the first place. Then you have the banking quants (I used to be one of those), who are proficient in derivatives pricing, stochastic calculus and partial differential equations, but if you ask them why a stock price movement is generally assumed to be lognormal, they don’t have answers. Then you have the coders, who can hack, scrape and write really efficient code, but don’t know much math. And mathematicians who can come up with elegant solutions but who are divorced from reality.

While you might make a career out of falling under any of the above categories, to truly unleash your potential as a quant, you should be able to do all of these things. You should be a fox and should know each of these tricks. And unlike the fox in the Old Russian fairy tale, the key to being a good fox is to know what trick to use when. Let me illustrate this with an example from my work today (actual problem statement masked since it involves client information).

So there were two possible distributions that a particular data point could have come from and I had to try and analyze which of them it came from (simple Bayesian probability, you might think). However, calculating the probability wasn’t so straightforward, as it wasn’t a standard function. Then I figured I could solve the probability problem using the inclusion-exclusion principle (maths again), and wrote down a mathematical formulation for it.
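The “which distribution did this point come from” step can be sketched in a few lines. This is only an illustration under stated assumptions: the actual distributions in my problem were non-standard and are masked, so the two normal distributions, the equal prior and the names `normal_pdf` and `posterior_of_a` are all hypothetical stand-ins.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a normal distribution (stand-in for the real, masked likelihood)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior_of_a(x, likelihood_a, likelihood_b, prior_a=0.5):
    """Bayes' rule for two candidates: P(A | x) = P(A) f_A(x) / (P(A) f_A(x) + P(B) f_B(x))."""
    weight_a = prior_a * likelihood_a(x)
    weight_b = (1.0 - prior_a) * likelihood_b(x)
    total = weight_a + weight_b
    if total == 0.0:
        raise ValueError("the data point is impossible under both candidates")
    return weight_a / total


# Hypothetical candidates: A is N(0, 1) and B is N(1, 1).
p_a = posterior_of_a(
    0.2,
    lambda x: normal_pdf(x, 0.0, 1.0),
    lambda x: normal_pdf(x, 1.0, 1.0),
)
```

The hard part in practice is not this formula but evaluating the likelihoods when they are not standard functions.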

Now, I was dealing with a rather large data set and would have to use the computer, so I turned my mathematical solution into pseudo-code. Then, I realized that the pseudo-code was recursive, and given the size of the problem I would soon run out of memory. I had to figure out a solution using dynamic programming. Then, following some more code optimization, I had the probability. And then I had to go back to do the Bayesian analysis in order to complete the solution, and finally present it in the form of a “business solution”, with all the above mathematical jugglery being abstracted from the client.
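To give a flavour of the move from recursion to dynamic programming, here is a sketch on a different, hypothetical problem (not the client one): the probability that exactly k out of n independent events occur. The natural recursion is P(n, k) = p_n · P(n−1, k−1) + (1 − p_n) · P(n−1, k), which blows up when called naively; filling in a single array iteratively gives the same answer in O(n·k) time and O(k) memory. The function name `prob_exactly_k` is my own invention for this example.

```python
def prob_exactly_k(probs, k):
    """Probability that exactly k of the independent events occur.

    probs[i] is the probability of the i-th event. This is the iterative
    (dynamic programming) form of the recursion
    P(n, k) = p_n * P(n-1, k-1) + (1 - p_n) * P(n-1, k).
    """
    if k < 0 or k > len(probs):
        return 0.0
    # dp[j] = probability that exactly j of the events seen so far occurred.
    dp = [1.0] + [0.0] * k
    for p in probs:
        # Go downwards so that dp[j - 1] still holds the value from before this event.
        for j in range(k, 0, -1):
            dp[j] = dp[j] * (1.0 - p) + dp[j - 1] * p
        dp[0] *= 1.0 - p
    return dp[k]


# Two fair coin tosses: the chance of exactly one head.
one_head = prob_exactly_k([0.5, 0.5], 1)
```

Because nothing is recursive any more, the number of events is limited by running time rather than by the depth of the call stack.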

This versatility can come in handy in other places, too. There was a problem for which I figured out that the appropriate solution involved building a classification tree. However, given the nature of the data at hand, none of the off-the-shelf classification tree algorithms were ideal. So I simply went ahead and wrote my own code for creating such trees. Then, I figured that building a classification tree is in effect a greedy algorithm, and can lead to getting stuck at local optima. And so I put in a simulated annealing aspect to it.
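The simulated annealing idea, stripped of the tree, looks roughly like the sketch below. It is not the code I wrote for the trees; it is a generic, assumed version in which `anneal`, `bumpy_cost` and `random_step` are hypothetical names and the search is over integers rather than trees. The point is the acceptance rule: a worse candidate is sometimes accepted, more often early on, which is what lets the search climb out of a local optimum that a purely greedy method would stay in.

```python
import math
import random


def anneal(initial, neighbour, cost, steps=5000, start_temp=1.0, seed=0):
    """Minimise cost(state) by simulated annealing; returns (best_state, best_cost)."""
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    for step in range(steps):
        # Linear cooling schedule; the small constant keeps the temperature positive.
        temp = start_temp * (1.0 - step / steps) + 1e-9
        candidate = neighbour(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


def bumpy_cost(x):
    """Toy cost over the integers 0..99 with many local minima."""
    return (x - 70) ** 2 / 100.0 + 3.0 * (x % 7)


def random_step(x, rng):
    """Move a few places left or right, staying inside 0..99."""
    return min(99, max(0, x + rng.choice([-3, -2, -1, 1, 2, 3])))


best_x, best_value = anneal(5, random_step, bumpy_cost)
```

For a tree, the state would be the tree itself and the neighbour function would change one split at a time, with the cost being something like misclassification on held-out data.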

While I may not have in-depth knowledge of any of the above techniques (to gain breadth you have to sacrifice depth), the fact that I’m aware of a wide variety of techniques means I can provide the solution that is best for the problem at hand. And as I go along, I hope to keep learning more and more techniques – even if I don’t use them, being aware of them will lead to better overall problem solving.


This is my first ever handwritten post. I wrote this using a Natraj 621 pencil in a notebook while involved in an otherwise painful activity to which I thankfully didn’t have to pay much attention. I’m now typing it out verbatim from what I’d written. There might be inaccuracies because I have lousy handwriting. I begin:

People like models. People like models because models give them a feeling of being in control. When you observe a completely random phenomenon, financial or otherwise, it causes a feeling of unease. You feel uncomfortable that there is something that is beyond the realm of your understanding, which is inherently uncontrollable. And so, in order to get a better handle on what is happening, you resort to a model.

The basic feature of models is that they need not be exact. They need not be precise. They are basically a broad representation of what is actually happening, in a form that is easily understood. As I explained above, the objective is to describe and understand something that we weren’t able to fundamentally comprehend.

All this is okay but the problem starts when we ignore the assumptions that were made while building the model, and instead treat the model as completely representative of the phenomenon it is supposed to represent. While this may allow us to build on these models using easily tractable and precise mathematics, what this leads to is that a lot of the information that went into the initial formulation is lost.

Mathematicians are known for their affinity towards precision and rigour. They like to have things precisely defined, and measurable. You are likely to find them going into a tizzy when faced with something “grey”, or something not precisely measurable. Faced with a problem, the first thing the mathematician will want to do is to define it precisely, and eliminate as much of the greyness as possible. What they ideally like is a model.

From the point of view of mathematicians, with their fondness for precision, it makes complete sense to assume that the model is precise and complete. This allows them to bring in all their beautiful math without dealing with ugly “greyness”. Actual phenomena are now irrelevant. The model reigns supreme.

Now you can imagine what happens when you put a bunch of mathematically minded people on this kind of a problem. And maybe even create an organization full of them. It is not hard to guess what happens here – with a bunch of similar thinking people, their thinking becomes the orthodoxy. Their thinking becomes fact. Models reign supreme. The actual phenomenon becomes a four-letter word. And this kind of thinking gets propagated.

Soon the people fail to see beyond the models. They refuse to accept that the phenomenon might not obey their models. The model, they think, should drive the phenomenon, rather than the other way around. The tail wagging the dog, basically.

I’m not going into the specifics here, but this might give you an idea as to why the financial crisis happened. This might give you an insight into why obvious mistakes were made, even when the incentives were loaded in favour of the bankers getting it right. This might give you an insight as to why internal models in Moody’s even assumed that housing prices can never decrease.

I think there is a lot more that can be explained by this love for models and ignorance of phenomena. I’ll leave that as an exercise for the reader.

Apart from commenting about the content of this post, I also want your feedback on how I write when I write with pencil-on-paper, rather than on a computer.


Addition to the Model Makers Oath

Paul Wilmott and Emanuel Derman, in an article in Business Week a couple of years back (at the height of the financial crisis), came up with a model makers’ oath. It goes:

• I will remember that I didn’t make the world and that it doesn’t satisfy my equations.

• Though I will use models boldly to estimate value, I will not be overly impressed by mathematics.

• I will never sacrifice reality for elegance without explaining why I have done so. Nor will I give the people who use my model false comfort about its accuracy. Instead, I will make explicit its assumptions and oversights.

• I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension.

While I like this, and try to abide by it, I want to add another point to the oath:

As a quant, it is part of my responsibility to ensure that my fellow quants don’t misuse quantitative models in finance and bring disrepute to my profession. I will do my best to be on the lookout for deviant behaviour on the part of other quants, and to ensure that they too adhere to these principles.

Go read the full article in the link above (by Wilmott and Derman). It’s a great read. And coming back to the additional point I’ve suggested here, I’m not sure I’ve drafted it concisely enough. Help in editing and making it more concise and precise is welcome.