Studying on Coursera

Over the last year or more I’ve signed up for and dropped out of at least a dozen Coursera courses. The problem has been that the video lectures have not kept me engaged. I tend to multitask while watching these videos, and the sheer volume of videos in some of these courses has been such that I’ve quickly fallen behind and then lost interest. I must, however, admit that many of these courses haven’t been particularly challenging. In courses such as “Model Thinking” or “Social Network Analysis” I already knew a lot of the material, and thus lost interest. Modern World History (by Philip Zelikow) was more of an information-only course, which I could have consumed better in the form of a book.

Given that I’ve had bursts of signing up for courses and then not following up on them, I avoided signing up for any new courses for the last six months. That was until two weeks back when, on a reasonably jobless evening during a visit to my client’s Mumbai office, I decided to sign up for this course on Asset Pricing. And what a course it has been so far!

I went to bed close to midnight last night. I watched neither the Champions League final nor Arsenal’s draw at West Brom. I was doing my assignments. I spent three hours on a Sunday evening doing my assignments for the Coursera Asset Pricing course, offered by Prof John Cochrane of the University of Chicago.

I’ve only completed the assignments of “Week 0” of the eight-week-long course and watched the lectures of “Week 1”, and I’m hooked already. I must admit that nobody has taught me finance like this so far. At IIM Bangalore, where I got my MBA seven years ago, we had a course on microeconomics, a course on corporate finance and a course on financial derivatives (elective). The problem, however, was that nobody made the links between any of these.

We studied the concept of marginal utility in Economics, but none of the finance professors touched it. In corporate finance, we touched upon CAPM and Modigliani-Miller but none of the later finance courses referred to them. There was a derivation of the Black-Scholes pricing model in the course on derivatives, but that didn’t touch upon any other finance we had learnt. In short, we had just been provided with the components, and nobody had helped us connect them.

The beauty of the Chicago course is that it is holistic, and so well connected. The same professor, in the same course, teaches us diffusions in one lecture and in another uses marginal utility theory from economics to explain the concept of interest rates. In one assignment he has got us to do regressions, and in others we do stochastic calculus. Having seen each of these concepts separately, I’m absolutely enjoying all the connections, and that is perhaps helping me keep my interest in the course.

And it is a challenging course. It is a PhD level course at Chicago (current students at the university are taking the course in parallel with us online students) and my complacency was shattered when I got 3.5 out of 11 in my first quiz. It assumes a certain proficiency in both finance and math, and then builds on it, in a way no finance course I’ve ever taken did.

What also sets the course apart is the quality of the assignments. Each assignment makes you think, and makes you do. For example, in one assignment I did last night I had to run a set of regressions and then report t values and R^2s. In another, I had to plot a graph (which I did using Excel) and then report certain points from the graph. Some other assignments make sure you have internalized what was taught in the lectures. It has been extremely exciting so far.

Based on my experience with the course so far, I hope my enthusiasm will last. I don’t know if this course will help me directly professionally. However, there is no doubt that it keeps me intellectually honest and keeps me sharp. I might not have had the option to take too many such courses during my formal education. I hope I can set this right on Coursera.

Hedgehogs and foxes: Or, a day in the life of a quant

I must state at the outset that this post is inspired by the second chapter of Nate Silver’s book The Signal and the Noise. In that chapter, which is about election forecasting, Silver draws upon the old Russian parable of the hedgehog and the fox. According to that story, the fox knows several tricks while the hedgehog knows only one – curling up into a ball. The story ends in favour of the hedgehog, as none of the tricks of the unfocused fox can help him evade the predator.

Most political pundits, says Silver, are like hedgehogs. They have just one central idea to their punditry and they tend to analyze all issues through that. A good political forecaster, however, needs to be able to accept and process any new data inputs, and include them in his analysis. With just one technique, this can be hard to achieve, and so Silver says that to be a good political forecaster one needs to be a fox. While this might lead to some contradictory statements and thus bad punditry, it leads to good forecasts. Anyway, you can learn about election forecasting from Silver’s book.

The world of “quant” and “analytics” which I inhabit is again similarly full of hedgehogs. You have the statisticians, whose solution for every problem is a statistical model. They can wax eloquent about Log Likelihood Estimators but can have trouble explaining why you should use that in the first place. Then you have the banking quants (I used to be one of those), who are proficient in derivatives pricing, stochastic calculus and partial differential equations, but if you ask them why a stock price movement is generally assumed to be lognormal, they don’t have answers. Then you have the coders, who can hack, scrape and write really efficient code, but don’t know much math. And mathematicians who can come up with elegant solutions but who are divorced from reality.

While you might make a career out of falling under any of the above categories, to truly unleash your potential as a quant, you should be able to do all of these. You should be a fox and should know each of these tricks. And unlike the fox in the old Russian fairy tale, the key to being a good fox is to know which trick to use when. Let me illustrate this with an example from my work today (actual problem statement masked since it involves client information).

So there were two possible distributions that a particular data point could have come from and I had to try and analyze which of them it came from (simple Bayesian probability, you might think). However, calculating the probability wasn’t so straightforward, as it wasn’t a standard function. Then I figured I could solve the probability problem using the inclusion-exclusion principle (maths again), and wrote down a mathematical formulation for it.
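The Bayesian step on its own can be sketched as below. This is only an illustration: the two normal densities, their parameters and the equal priors are assumptions made up for the example, whereas the actual likelihood in my problem was not a standard function.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_a(x, prior_a, like_a, like_b):
    """P(source is A | x) by Bayes' rule, for two candidate sources A and B."""
    num = prior_a * like_a(x)
    den = num + (1.0 - prior_a) * like_b(x)
    return num / den

# Hypothetical candidates (assumptions, not the real problem):
# A ~ N(0, 1) and B ~ N(3, 1), with equal prior weight on each.
def like_a(x):
    return normal_pdf(x, 0.0, 1.0)

def like_b(x):
    return normal_pdf(x, 3.0, 1.0)

p_mid = posterior_a(1.5, 0.5, like_a, like_b)   # point halfway between the means
p_near_a = posterior_a(0.0, 0.5, like_a, like_b)  # point sitting on A's mean
print(p_mid, p_near_a)
```

Any function that returns the likelihood of the data point under each candidate can be passed in place of `like_a` and `like_b`; computing those likelihoods was the hard part in my case.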

Now, I was dealing with a rather large data set and would have to use the computer, so I turned my mathematical solution into pseudo-code. Then, I realized that the pseudo-code was recursive, and given the size of the problem I would soon run out of memory. I had to figure out a solution using dynamic programming. Then, following some more code optimization, I had the probability. And then I had to go back to do the Bayesian analysis in order to complete the solution. And then present the solution in the form of a “business solution”, with all the above mathematical jugglery being abstracted from the client.
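Since the real problem is masked, here is a stand-in that shows the same move from a recursive formulation to dynamic programming. The problem chosen (the probability that exactly k out of a set of independent events occur) and the function names are assumptions for illustration, not what I actually solved.

```python
def prob_exactly_k_recursive(probs, k):
    """Direct recursion on the recurrence; the number of calls grows
    exponentially with len(probs), so it is only usable on tiny inputs."""
    def rec(n, j):
        if j < 0 or j > n:
            return 0.0
        if n == 0:
            return 1.0  # only reachable with j == 0
        p = probs[n - 1]
        # Event n either occurs (need j - 1 from the rest) or does not.
        return p * rec(n - 1, j - 1) + (1.0 - p) * rec(n - 1, j)
    return rec(len(probs), k)

def prob_exactly_k_dp(probs, k):
    """Same recurrence filled bottom-up, keeping only k + 1 numbers in memory."""
    dist = [1.0] + [0.0] * k  # dist[j] = P(exactly j events so far)
    for p in probs:
        # Update from high j to low j so dist[j - 1] still holds the old value.
        for j in range(k, 0, -1):
            dist[j] = p * dist[j - 1] + (1.0 - p) * dist[j]
        dist[0] *= 1.0 - p
    return dist[k]

probs = [0.5, 0.5, 0.5]
print(prob_exactly_k_recursive(probs, 2), prob_exactly_k_dp(probs, 2))
```

Both functions evaluate the same recurrence. The second does it in time proportional to len(probs) times k and never recurses, which is what makes it usable on a large data set.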

This versatility can come in handy in other places, too. There was a problem for which I figured out that the appropriate solution involved building a classification tree. However, given the nature of the data at hand, none of the off-the-shelf classification tree algorithms were ideal. So I simply went ahead and wrote my own code for creating such trees. Then, I figured that classification trees are in effect built by a greedy algorithm, which can get stuck at local optima. And so I put in a simulated annealing aspect to it.

While I may not have in-depth knowledge of any of the above techniques (to gain breadth you have to sacrifice depth), the fact that I’m aware of a wide variety of techniques means I can provide the solution that is best for the problem at hand. And as I go along, I hope to keep learning more and more techniques – even if I don’t use them, being aware of them will lead to better overall problem solving.

Why standard deviation is not a good measure of volatility

Most finance textbooks, at least the ones that are popular in Business Schools, use standard deviation as a measure of volatility of a stock price. In this post, we will examine why it is not a great idea. To put it in one line, the use of standard deviation loses information on the ordering of the price movement.

As earlier, let us consider two time series (let’s simply call them “series 1” and “series 2”) and compare their volatilities. The table here shows the two series:

[Image: vol1, table of the two series]

What can you say of the two series now? You think they are similar? You might notice that both contain the same set of numbers, but jumbled up. Let us look at the volatility as expressed by standard deviation. Unsurprisingly, since both series contain the same set of numbers, the volatility of both series is identical – at 8.655.

However, does this mean that the two series are equally volatile? Not particularly, as you can see from this graph of the two series:

[Image: vol2, graph of the two series]

It is clear from the graph (if it was not clear from the table already) that series 2 is much more volatile than series 1. So how can we measure it? Most textbooks on quantitative finance (as opposed to textbooks on finance) use “Quadratic Variation” as a measure of volatility. How do we measure quadratic variation?

If we have a series of numbers from $a_1$ to $a_n$, then the quadratic variation of this series is measured as

$$\sum_{i=2}^{n} (a_i - a_{i-1})^2$$

Notice that the primary distinguishing feature of the quadratic variation is that it takes into account the sequence. So when you have something like series 2, with alternating positive and negative jumps, it gets captured in the quadratic variation. So what would be the quadratic variation values for the two time series we have here?

The QV of series 1 is 29 while that of series 2 is a whopping 6119, which is probably a fair indicator of their relative volatilities.
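The comparison is easy to reproduce. The sketch below uses made-up data rather than the two series from the table above (so the numbers will not match 8.655, 29 or 6119): the same ten values, once in increasing order and once zig-zagging between low and high.

```python
import statistics

def quadratic_variation(series):
    """Sum of squared differences between consecutive observations."""
    return sum((b - a) ** 2 for a, b in zip(series, series[1:]))

# Hypothetical data (not the table from this post): identical values,
# different ordering.
series1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
series2 = [1, 10, 2, 9, 3, 8, 4, 7, 5, 6]

sd1 = statistics.pstdev(series1)
sd2 = statistics.pstdev(series2)
qv1 = quadratic_variation(series1)
qv2 = quadratic_variation(series2)

print(f"std dev: {sd1:.3f} vs {sd2:.3f}")  # order does not matter
print(f"QV:      {qv1} vs {qv2}")          # order matters a great deal
```

For this toy data the standard deviations are identical, while the quadratic variations are 9 and 285.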

So why standard deviation?

Now you might ask why textbooks use standard deviation at all then, if it misses out so much of the variation. The answer, not surprisingly, lies in quantitative finance. When the price of a stock (X) is governed by a Wiener process, or

$$dX = \sigma \, dW$$

then the quadratic variation of the stock price (between time 0 and time t) can be shown to be $\sigma^2 t$, which for $t = 1$ is $\sigma^2$, the variance of the process.
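This can be checked numerically with a quick simulation. The sketch below discretizes the process into small time steps; the values of sigma, t, the number of steps and the seed are arbitrary choices for illustration.

```python
import math
import random

def simulate_path(sigma, t, n_steps, seed=0):
    """Discretized path of dX = sigma dW on [0, t], starting at X = 0."""
    rng = random.Random(seed)
    dt = t / n_steps
    path = [0.0]
    for _ in range(n_steps):
        # Each increment is normal with mean 0 and variance sigma^2 * dt.
        path.append(path[-1] + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
    return path

def quadratic_variation(series):
    """Sum of squared differences between consecutive observations."""
    return sum((b - a) ** 2 for a, b in zip(series, series[1:]))

sigma, t = 0.2, 1.0  # arbitrary illustrative values
path = simulate_path(sigma, t, n_steps=100_000)
qv = quadratic_variation(path)

print(f"simulated QV: {qv:.5f}, sigma^2 * t: {sigma ** 2 * t:.5f}")
```

With this many steps the simulated quadratic variation should land close to sigma^2 t, and the gap shrinks as the number of steps grows.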

Because for a particular kind of process, which is commonly used to model stock price movement, the quadratic variation is equal to variance, variance is commonly used as a substitute for quadratic variation as a measure of volatility.

However, considering that in practice stock prices are seldom Brownian (either arithmetic or geometric), this equivalence doesn’t necessarily hold.

This is also a point that Benoit Mandelbrot makes in his excellent book The (mis)Behaviour of Markets. He calls this the Joseph effect (he uses the biblical story of Joseph, who interpreted Pharaoh’s dream of seven fat cows being eaten by seven lean cows, and predicted that seven years of plenty would be followed by seven years of famine). Financial analysts, by using a simple variance (or standard deviation) to characterize volatility, miss out on such serial effects.