Hill Climbing in real life

Fifteen years back, I enrolled for a course on Artificial Intelligence as part of my B.Tech. programme at IIT Madras. It was well before stuff like “machine learning” and “data science” became big, and the course was mostly devoted to heuristics. Incidentally, that term, we had to pick between this course and one on Artificial Neural Networks (I guess nowadays that one is more popular given the hype about Deep Learning?), which meant that I didn’t learn about neural networks until last year or so.

A little googling tells me that Deepak Khemani, who taught us AI in 2002, has put up his lectures online, as part of the NPTEL programme. The first one is here:

In fact, the whole course is available here.

Anyways, one of the classes of problems we dealt with in the course was “search”. Basically, how does a computer “search” for the solution to a problem within a large “search space”?

One of the simplest heuristics is what has come to be known as “hill climbing” (too lazy to look through all of Khemani’s lectures and find where he’s spoken about this). I love computer science because a lot of computer scientists like to describe ideas in terms of intuitive metaphors. Hill climbing is definitely one such!

Let me explain it from the point of view of my weekend vacation in Edinburgh. One of my friends who had lived there a long time back recommended that I hike up this volcanic hill in the city called “Arthur’s Seat”.

On Saturday evening, I left my wife and daughter and wife’s parents (who I had travelled with) in our AirBnB and walked across town (some 3-4 km) to reach Holyrood Palace, from where Arthur’s Seat became visible. This is what I saw: 

Basically, what you see is the side of a hill, and if you look closely, there are people walking up the sides. So what you guess is that you need to make your way to the bottom of the hill and then just climb.

But then you make your way to the base of the hill and see several paths leading up. Which one do you take? You take the path that seems steepest, believing that’s the one that will take you to the top quickest. And so you take a step along that path. And then see which direction to go to climb up steepest. Take another step. Rinse. Repeat. Until you reach a point where you can no longer find a way up. Hopefully that’s the peak.

Most of the time, you are likely to end up on the top of a smaller rock. In any case, this is the hill climbing algorithm.
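For the programmers reading this, here is a minimal sketch of the procedure described above, assuming a one-dimensional “landscape” given as a list of heights. The function name `hill_climb` and the sample heights are made up for illustration; real search problems have many more directions to choose from, but the greedy logic is the same.

```python
def hill_climb(heights, start):
    """Greedy steepest-ascent hill climbing on a 1-D list of heights.

    Returns the index where the climb stops: a point with no strictly
    higher neighbour (a local maximum, not necessarily the highest peak).
    """
    position = start
    while True:
        # Look one step to the left and right, staying inside the landscape.
        neighbours = [i for i in (position - 1, position + 1)
                      if 0 <= i < len(heights)]
        if not neighbours:
            return position  # a one-point landscape: nowhere to go
        # Pick the higher neighbour (the steepest way up from here).
        best = max(neighbours, key=lambda i: heights[i])
        if heights[best] <= heights[position]:
            return position  # no way up from here, so stop
        position = best


# Hypothetical landscape: a smaller crag (height 5) next to the main peak (height 9).
landscape = [0, 2, 5, 3, 1, 4, 7, 9, 6]

print(hill_climb(landscape, start=1))  # stops at index 2, on the smaller crag
print(hill_climb(landscape, start=4))  # stops at index 7, on the main peak
```

Where you end up depends entirely on where you start, which is the weakness the rest of this story illustrates.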

So back to my story. I reached the base of the hill and set off on the steepest marked path.

I puffed and panted, but I kept going. It was rather windy that day, and it was threatening to rain. I held my folded umbrella and camera tight, and went on. I got beautiful views of Edinburgh city, and captured some of them on camera. And after a while, I got tired, and decided to call my wife using Facetime.

In any case, it appeared that I had a long way to go, given the rocks that went upwards just to my left (I was using a modified version of hill climbing in that I used only marked paths. As I was to rediscover the following day, I have a fear of heights). And I told that to my wife. And then suddenly the climb got easier. And before I knew it I was descending. And soon enough I was at the bottom all over again!

And then I saw the peak. Basically what I had been climbing all along was not the main hill at all! It was a “side hill”, which I later learnt is called the “Salisbury Crags”. I got down to the middle of the two hills, and stared at the valley there. I realised that was a “saddle point”, and hungry and tired and not wanting to get soaked in rain, I made my way out, hailed a cab and went home.

I wasn’t done yet. Determined to climb the “real peak”, I returned the next morning. Again I walked all the way to the base of the hill, and started my climb at the saddle point. It was a tough climb – while there were rough steps in some places, in others there were none. I kept climbing a few steps at a time, taking short breaks.

One such break happened to be too long, though, and gave me enough time to look down and feel scared. For a long time now I’ve had a massive fear of heights. Panic hit. I was afraid of going too close to the edge and falling off the hill. I decided to play it safe and turn back.

I came down and walked across the valley you see in the last picture above. Energised, I had another go. From what was possibly a relatively easier direction. But I was too tired. And I had to get back to the apartment and check out that morning. So I gave up once again.

I still have unfinished business in Edinburgh!

 

Shorting private markets

This is one of those things I’ll file in the “why didn’t I think of it before?” category.

The basic idea is that if you think there is a startup bubble, and that private companies (as a class) are being overvalued by investors, there exists a rather simple way to short the market – basically start your own company and sell equity to these investors!

The basic problem with shorting a market such as those for shares of privately held startups is that the shares are owned by a small set of investors, none of whom are likely to lend you stock that you can sell and buy back later. More importantly, markets in privately held stock can be incredibly illiquid, and it may take a long time indeed before the stocks move to what you think is their “right” level.

So what do you do? I’ll simply let the always excellent Matt Levine provide the answer here:

We have talked a few times in the past about the difficulty of shorting unicorns: Investors can buy shares in the big venture-backed private tech companies, but they can’t sell those shares short, which arguably leads to those shares being overvalued as enthusiasts join in but skeptics are excluded. As I once said, though, “the way to profit from a bubble is by selling into it, and that people sometimes focus too narrowly on short-selling into it”: If you think that unicorns as a category are overvalued, the way to profit from that is not so much by shorting Uber as it is by founding your own dumb startup, raising a lot of money from overenthusiastic venture capitalists, paying yourself a big salary, and walking away whistling when the bubble collapses.

Same here! If you are skeptical of the ICO trend, the right thing to do is not to short all the new tokens that are coming to market. It’s to build your own token, do an initial coin offering, and walk off with the proceeds. For the sake of your own conscience, you can just go ahead and say that that’s what you’re doing, right in the ICO white paper. No one seems to mind.

Seriously! Why didn’t I think of this?

Scott Adams, careers and correlation

I’ve written here earlier about how much I’ve been influenced by Scott Adams’s career advice about “being in top quartile of two or more things”. To recap, this is what Adams wrote nearly ten years back:

If you want an average successful life, it doesn’t take much planning. Just stay out of trouble, go to school, and apply for jobs you might like. But if you want something extraordinary, you have two paths:

1. Become the best at one specific thing.
2. Become very good (top 25%) at two or more things.

The first strategy is difficult to the point of near impossibility. Few people will ever play in the NBA or make a platinum album. I don’t recommend anyone even try.

Having implemented this to various degrees of success over the last 5-6 years, I propose a small correction – basically to follow the second strategy that Adams has mentioned, you need to take correlation into account.

Basically there’s no joy in becoming very good (top 25%) at two or more correlated things. For example, if you think you’re in the top 25% in terms of “maths and physics” or “maths and computer science” there’s not so much joy because these are correlated skills. Lots of people who are very good at maths are also very good at physics or computer science. So there is nothing special in being very good at such a combination.

Why Adams succeeded was that he was very good at 2-3 things that are largely uncorrelated – drawing, telling jokes and understanding corporate politics are not very correlated to each other. So the combination of these three skills of his was rather unique to find, and their combination resulted in the wildly successful Dilbert.

So the key is this – in order to be wildly successful, you need to be very good (top 25%) at two or three things that are not positively correlated with each other (either orthogonal or negative correlation works). That ensures that if you can put them together, you can offer something that very few others can offer.
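To put rough numbers on this, here is a small simulation sketch. It assumes, purely for illustration, that the two skills are normally distributed across the population with a given correlation. The function name and the correlation values tried are my own choices, not anything Adams proposed.

```python
import math
import random


def share_in_top_quartile_of_both(correlation, people=50_000, seed=0):
    """Simulate two skills with the given correlation and return the
    fraction of people who land in the top 25% of both."""
    rng = random.Random(seed)
    skill_a, skill_b = [], []
    for _ in range(people):
        x = rng.gauss(0, 1)
        noise = rng.gauss(0, 1)
        # Standard construction of a second normal variable that has
        # the requested correlation with the first one.
        y = correlation * x + math.sqrt(1 - correlation ** 2) * noise
        skill_a.append(x)
        skill_b.append(y)

    # Empirical 75th-percentile cut-offs for each skill.
    cutoff_a = sorted(skill_a)[int(0.75 * people)]
    cutoff_b = sorted(skill_b)[int(0.75 * people)]
    both = sum(1 for a, b in zip(skill_a, skill_b)
               if a >= cutoff_a and b >= cutoff_b)
    return both / people


for rho in (-0.5, 0.0, 0.8):
    print(rho, round(share_in_top_quartile_of_both(rho), 3))
```

With uncorrelated skills, about 1 in 16 people (0.25 × 0.25) are in the top quartile of both. With strongly positively correlated skills the share is noticeably higher, so the combination is less rare, and with negatively correlated skills it is rarer still.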

Then again, the problem there is that the market for this combination of skills will be highly illiquid – low supply means people who might demand such combinations would have adapted to make do with some easier to find substitute, so demand is lower, and so on. So in that sense, again, it’s a massive hit-or-miss!

Maths, machine learning, brute force and elegance

Back when I was at the International Maths Olympiad Training Camp in Mumbai in 1999, the biggest insult one could hurl at a peer was to describe the latter’s solution to a problem as being a “brute force solution”. Brute force solutions, which were often ungainly, laboured and unintuitive were supposed to be the last resort, to be used only if one were thoroughly unable to implement an “elegant solution” to the problem.

Mathematicians love and value elegance. While they might be comfortable with esoteric formulae and the Greek alphabet, they are always on the lookout for solutions that are, at least to the trained eye, intuitive to perceive and understand. Among other things, the belief is that an elegant solution is much easier to understand intuitively.

When all the parts of the solution seem to fit so well into each other, with no loose ends, it is far easier to accept the solution as being correct (even if you don’t understand it fully). Brute force solutions, on the other hand, inevitably leave loose ends and appreciating them can be a fairly massive task, even to trained mathematicians.

In the conventional view, though, non-mathematicians don’t have much fondness for elegance. A solution is a solution, and a problem solved is a problem solved.

With the coming of big data and increased computational power, however, the tables are getting turned. In this case, the more mathematical people, who are more likely to appreciate “machine learning” algorithms, recommend “leaving it to the system” – to unleash the brute force of computational power at the problem so that the “best model” can be found, and later implemented.

And in this case, it is the “half-blood mathematicians” like me, who are aware of complex algorithms but are unsure of letting the system take over stuff end-to-end, who bat for elegance – to look at data, massage it, analyse it and then find that one simple method or transformation that can throw immense light on the problem, effectively solving it!

The world moves in strange ways.

Levels and shifts in analysing games

So Nitin Pai and Pranay Kotasthane have a great graphic on how India should react to China’s aggressions on Doka La. While the analysis is excellent, my discomfort is with the choice of “deltas” as the axes of this payoff diagram, rather than levels.

Source: Nitin Pai and Pranay Kotasthane

Instead, what might have been preferable would have been to define each country’s strategies in terms of levels of aggression, define their current levels of aggression, and evaluate the two countries’ strategies in terms of moving to each possible alternate level. Here is why.

The problem with using shifts (or “deltas” or “slopes” or whatever you call the movement between levels) is that they are not consistent. Putting it mathematically, the tangent doesn’t measure the rate of change in a curve when you go far away from the point where you’ve calibrated the tangent.
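A tiny numerical sketch of the tangent point, assuming a made-up curved payoff f(x) = x² and a tangent calibrated at x = 1 (both are illustrative choices and have nothing to do with the actual payoffs in the graphic):

```python
def payoff(x):
    """Hypothetical curved payoff, chosen only for illustration."""
    return x ** 2


def tangent_at_1(x):
    """Straight-line approximation calibrated at x = 1.

    At x = 1 the curve x**2 has value 1 and slope 2.
    """
    return 1 + 2 * (x - 1)


for x in (1.0, 1.1, 2.0, 5.0):
    gap = payoff(x) - tangent_at_1(x)
    print(f"x={x}: true={payoff(x):.2f} tangent={tangent_at_1(x):.2f} gap={gap:.2f}")
```

Close to the calibration point the gap is negligible (0.01 at x = 1.1), but at x = 5 the tangent is off by 16. The further you move from where the slope was measured, the less the slope tells you.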

To illustrate, let’s use this diagram itself. The strategy is that India should “hold”. From the diagram, if India holds, China’s best option is to escalate. In the next iteration, India continues to hold, and China continues to escalate. After a few such steps, surely we will be far away enough from the current equilibrium that the payoff for changing stance is very different from what is represented by this diagram?

This graph is perhaps valid for the current situation where (say) India’s aggression level is at 2 on a 1–5 integer scale, while China is at 3. But will the payoffs of going up and down by a notch be the same if India is still at 2 and China has reached the maximum pre-war aggression of 5 (remember that both are nuclear powers)?

On the flip side, the good thing about using payoffs based on changes in level is that it keeps the payoff diagram small, and this is especially useful when the levels cannot be easily discretised or there are too many possible levels. Think of a 5×5 square graph, or even a 10×10, in place of the 3×3, for example – soon it can get rather unwieldy. That is possibly what led Nitin and Pranay to choose the delta graph.

Mirrored here.

Schoolkid fights, blockchain and smart contracts

So I’ve been trying to understand the whole blockchain thing better, since people nowadays seem to be wanting to use it for all kinds of contracts (even the investment bankers are taking interest, which suggests there’s some potential out there 😛 ).

One of the things I’ve been doing is to read this book (PDF) on Blockchain by Arvind Narayanan and co at Princeton. It’s an easy to read, yet comprehensive, take on bitcoin and cryptocurrency technologies, the maths behind them and so on.

And as I’ve been reading it, I’ve been developing my own oversimplified model of what blockchain and smart contracts are, and this is my take at explaining it.

Imagine that Alice and Bob are two schoolkids and they’ve entered into a contract which states that if Alice manages to climb a particular tree, Bob will give her a bar of chocolate. Alice duly climbs the tree and claims the chocolate, at which point Bob flatly denies that she climbed it and refuses to give her the chocolate. What is Alice to do?

In the conventional “contract world”, all that Alice can do is to take the contract that she and Bob had signed (assume they had formalised it) and take it to a court of law (a schoolteacher, perhaps, in this case), which will do its best possible in order to determine whether she actually climbed the tree, and then deliver the judgment.

As you may imagine, in the normal schoolkid world, going to a teacher for adjudicating on whether someone climbed a tree (most likely an “illegal” activity by school rules) is not the greatest way to resolve the fight. Instead, either Alice and Bob will try to resolve it by themselves, or call upon their classmates to do the same. This is where the blockchain comes in.

Simply put, in terms of the blockchain “register”, as long as more than half of Alice and Bob’s classmates agree that she climbed the tree, she is considered to have climbed the tree, and Bob will be liable to give her chocolate. In other words, the central “trusted third party” gets replaced by a decentralised crowd of third parties where the majority decision is taken to be the “truth”.

Smart contracts take this one step further. Bob will give the bar of chocolate to the collective trust of his classmates (the adjudicators). And if a majority of them agree that Alice did climb the tree, the chocolate will be automatically given to her. If not, it will come back to Bob. What blockchain technologies allow for is to write code in a clever manner so that this can get executed automatically.
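In the spirit of my oversimplified model, the chocolate escrow could be sketched as follows. This is plain Python, not code for any real smart contract platform, and the function name `settle_escrow` is hypothetical.

```python
def settle_escrow(votes, claimant="Alice", depositor="Bob"):
    """Decide who receives the escrowed chocolate.

    `votes` holds one boolean per classmate: True means
    "the claimant did climb the tree".
    """
    yes_votes = sum(votes)
    # Strict majority: more than half of the classmates must agree.
    if yes_votes * 2 > len(votes):
        return claimant
    # Otherwise the chocolate goes back to whoever deposited it.
    return depositor


print(settle_escrow([True, True, False]))   # Alice
print(settle_escrow([True, False, False]))  # Bob
```

The point is that once the chocolate has been deposited, nobody needs to trust Bob to hand it over: the rule decides.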

This is a gross oversimplification, but it is broadly how the blockchain works. Each transaction is considered “valid” and put into the blockchain if a majority of nodes agree it’s valid. And in order to ensure that this voting doesn’t get rigged, the nodes (or judges) need to solve a difficult computational puzzle in order to be able to vote – this imposes an artificial cost of voting which makes sure that it’s not possible to rig the polls unless you can take over more than half the nodes (strictly, more than half of the network’s computing power) – and in a global blockchain where you have a really large number of nodes, this is not feasible.
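The “difficult computational puzzle” can be illustrated with a toy version: keep trying numbers until the hash of the data plus the number starts with a few zeros. This is a simplified sketch, assuming SHA-256 and a deliberately tiny difficulty so that it runs quickly. Real networks use far harder targets and a more involved block format, and the function names here are made up.

```python
import hashlib


def solve_puzzle(block_data, difficulty=3):
    """Search for a nonce such that SHA-256(block_data + nonce) starts
    with `difficulty` zero hex digits. This takes many attempts."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def check_puzzle(block_data, nonce, difficulty=3):
    """Verifying a claimed solution takes just one hash."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


nonce, digest = solve_puzzle("Alice climbed the tree")
print(nonce, digest)
print(check_puzzle("Alice climbed the tree", nonce))
```

Solving is expensive and checking is cheap, and that asymmetry is what makes each “vote” costly to cast but easy for everyone else to verify.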

So when you see that someone is building a blockchain based solution for this or that, you might wonder whether it actually makes sense. All you need to do is to come back to this schoolkid problem – for the kind of dispute that is likely to arise from this problem, would the parties prefer to go to a mutually trusted third party, or leave it to the larger peer group to adjudicate? Using the blockchain is a solution if and only if the latter case is true.

Selling yourself for job and consulting

So for the first time in over eight years, I’m looking for a job. This was primarily prompted by my move to London earlier this year – a consulting business where you rely on networks rather than a global brand to get new business cannot be easily transplanted. Moreover, as I’d written a year back, a lot of the objectives of the “portfolio life” have been achieved, so I’m willing to let go of the optionality.

While writing a “Cover Letter” for a job application yesterday I realised what makes selling yourself for a job so much harder than selling yourself for a consulting assignment – in the former case, you need to also communicate a “larger purpose”.

For the last 5-6 years I’ve been mostly selling myself for consulting assignments, and while it hasn’t been easy, all I’ve needed to do to sell has been to convince the potential client that I’ll do a good job solving whatever problem they have, and that my fee is a worthy investment for them. And to some extent I’ve become better over the years at making such arguments.

When you’re applying for a job, you not only have to convince the counterparty that you’ll be good at whatever you need to do, and that you are worth the salary that you are asking for, but also need to argue how the job will “improve your life”. You need to explain to them why the job fits into the list of stuff you’ve already done in your life. You need to talk about where you see yourself 5/10/50 years from now. You need to actually express interest in the job, and irrespective of how mundane the job description, you need to act like it’s the most exciting job ever.

And this is a part I haven’t been good at, basically since I haven’t done any of it for a long time now. And in any case, this is a part of the cover letter that people routinely bluff about, so I don’t know if recruiters even take this part seriously. In any case, I’ve been filling most of my cover letters so far with explanations of how I’ll do an awesome job of the job, and keeping only a cursory line or two about “how the job will improve my life”!