AlphaZero defeats Stockfish: Quick thoughts

The big news of the day, as far as I’m concerned, is the victory of Google DeepMind’s AlphaZero over Stockfish, currently the highest rated chess engine. This comes barely months after DeepMind’s AlphaGo Zero had bested the earlier avatar of AlphaGo in the game of Go.

Like its Go version, the AlphaZero chess playing machine learnt using reinforcement learning (I remember doing a term paper on the concept back in 2003 but have mostly forgotten). Basically it wasn’t given any “training data”; instead the machine trained itself by continuously playing against itself, with the feedback at each stage of learning helping it learn better.

After only about four hours of “training” (basically playing against itself and discovering moves), AlphaZero managed to record this victory in a 100-game match, winning 28 and losing none (the rest of the games were drawn).

There’s a sample game here on the Chess.com website and while this might be a biased sample (it’s likely that the AlphaZero engineers included the most spectacular games in their paper, from which this is taken), the way AlphaZero plays is vastly different from the way engines such as Stockfish have been playing.

I’m not that much of a chess expert (I “retired” from my playing career back in 1994), but the striking things for me from this game were

  • the move 7. d5 against the Queen’s Indian
  • The piece sacrifice a few moves later that was hard to see
  • AlphaZero’s consistent attempts until late in the game to avoid trading queens
  • The move Qh1 somewhere in the middle of the game

In a way (and being consistent with some of the themes of this blog), AlphaZero can be described as a “stud” chess machine, having taught itself to play based on feedback from games it’s already played (the way reinforcement learning broadly works is that actions that led to “good rewards” are incentivised in the next iteration, while those that led to “poor rewards” are penalised. The challenge in this case is to set up chess in a way that is conducive for a reinforcement learning system).
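To make the “good rewards are incentivised, poor rewards are penalised” idea concrete, here is a minimal sketch in Python. It is emphatically not how AlphaZero works (that combines a deep neural network with tree search); the function name `update_preferences`, the learning rate and the two opening moves are all made up for illustration.

```python
def update_preferences(preferences, episode, reward, learning_rate=0.1):
    """Return new preferences after nudging every action played in one game.

    preferences: dict mapping an action (here, a move name) to a score.
    episode: list of actions that were played in the game.
    reward: +1 for a win, -1 for a loss, 0 for a draw (an assumed convention).
    """
    updated = dict(preferences)
    for action in episode:
        # Actions from a won game become more attractive, from a lost game less so.
        updated[action] = updated.get(action, 0.0) + learning_rate * reward
    return updated


# Hypothetical feedback from two self-play games.
prefs = {"e4": 0.0, "d4": 0.0}
prefs = update_preferences(prefs, ["d4"], reward=+1)  # a game that started with d4 was won
prefs = update_preferences(prefs, ["e4"], reward=-1)  # a game that started with e4 was lost

preferred = max(prefs, key=prefs.get)
```

After these two games the toy learner prefers `d4`, simply because that is the move that was followed by a win. The hard part alluded to above is everything this sketch leaves out: how to represent a chess position and how to spread credit across the many moves of a long game.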

Engines such as Stockfish, on the other hand, are absolute “fighters”. They get their “power” by brute force, by going down nearly all possible paths in the game several moves down. This is supplemented by analysis of millions of existing games of various levels which the engine “learns” from – among other things, it learns how to prune and prioritise the paths it searches on. Stockfish is also fed a database of chess openings which it remembers and tries to play.

What is interesting is that AlphaZero has “discovered” some popular chess openings through the course of its self-learning. It is worth noting that some popular openings such as the King’s Indian or French find little favour with this engine, while others such as the Queen’s Gambit or the Queen’s Indian find favour. This is a very interesting development in terms of opening theory itself.

Frequency of openings over time employed by AlphaZero in its “learning” phase. Image sourced from AlphaZero research paper.

In any case, my immediate concern from this development is how it will affect human chess. Over the last decade or two, engines such as Stockfish have played a profound role in the development of chess, with current top players such as Magnus Carlsen or Sergey Karjakin having trained extensively with these engines.

The way top grandmasters play has seen a steady change in these years as they have ingested the ideas from engines such as Stockfish. The game has become far more quiet and positional, as players seek to gain small advantages which steadily improve over the course of (long) games. This is consistent with the way the engines that players learn from play.

Based on the evidence of the one game I’ve seen of AlphaZero, it plays very differently from the existing engines. Given this, it will be interesting to see how human players who train with AlphaZero-based engines (or their clones) will change their game.

Maybe chess will turn back to being a bit more tactical than it’s been in the last decade? It’s hard to say right now!

Curation, editing and predictability

One of my favourite lunchtime hobbies over the last year has been watching chess videos. My favourite publishers in this regard are GM Daniel King and Mato Jelic. King is a far superior analyst and goes into more depth while analysing games, though Jelic has a far larger repertoire (King usually only analyses games the day they were played).

In some ways I might be biased towards Jelic because his analysis and focus are largely in line with my strengths back during my days as a competitive chess player. Deep opening analysis, attacking games, the occasional tactical flourish and so on. He has a particular fondness for the games of Mikhail Tal, showering praises on his (Tal’s) sometimes erratic and seemingly purposeless sacrifices.

Once you watch a few videos of Jelic, though, you realise that there is a formula to his commentary. At some point in the game, he announces that the game is in a “critical position” and asks the viewer to pause the video and guess the next move. And a few seconds of pause later, he proceeds to show the move and move on with the game.

While this is an interesting exercise the first few times around, after a few times I started seeing a pattern – Jelic has a penchant for attacking positions, and the moves following his “critical positions” are more often than not sacrifices. And once I figured this bit out, I started explicitly looking for sacrifices or tactical combination every time he asked me to pause, and that has made the exercise a lot less fun.

I’d mentioned on this blog a few weeks back about my problem with watching movies – in that I’m constantly trying to second-guess the rest of the movie based on the information provided thus far. And when a movie gets too predictable, it tends to lose my attention. And thinking about it, I think sometimes it’s about curation or editing that makes things too predictable.

To take an example, my wife and I have been watching Masterchef Australia this year (no spoilers, please!), and I remarked to her the other day that episodes have been too predictable – at the end of every contest, it seems rather easy to predict who might win or go down, and so there has been little element of surprise in the show.

My wife remarked that this was not due to the nature of the competition itself (which she said is as good as earlier editions), but due to the poor editing of the show – during each competition, there is a disproportional amount of time dedicated to showing the spectacularly good and spectacularly bad performances.

Consequently, just this information – on who the show’s editors have chosen to focus on for the particular episode – conveys a sufficient amount of information on each person’s performance, without even seeing what they’ve made! A more equitable distribution of footage across competitors, on the other hand, would do a better job of keeping the viewers guessing!

It is similar in the case of Jelic’s videos. There is a pattern to the game situation where he pauses, which biases the viewer in terms of guessing what the next move will be. In order to make the experience superior for his viewers, Jelic should mix it up a bit, occasionally showing slow Carlsen-like positions, and stopping games at positional “critical positions”, for example. That can make the pauses more interesting, and improve viewer experience!

What are other situations where bad editing effectively gives away the plot, and diminishes the experience?

The importance of queen side counterplay

Back in 1994 when I was still playing competitive chess (I practically retired in a year’s time after a series of blunders under pressure), I had played in this one special tournament that was played to “prepare Karnataka youngsters for national events”. Though I wasn’t travelling to any of these events, being a “promising youngster” I had received an invitation to play.

It was a weird kind of tournament, for apart from us “youngsters”, there were these senior players from the state who participated in the tournament on and off. Their scores weren’t tallied – all they did was to make sure each “youngster” played an equal number of games against a “senior player” and only youngsters’ scores counted.

In the first round of the tournament, I faced off against a senior named Nagesh (if I remember correctly). Nagesh played white and played a King’s Indian Attack against my Sicilian Defence (part of this special tournament was to expose us to non-standard openings and plays). It was a hard fought middle and end game where experience ultimately prevailed, and I lost.

In the analysis after the game, Nagesh pointed out that while he had an established centre and strong king side attack, I had managed to build up a fairly expansive position on the queen side, and that I should have “pushed harder on the queenside for counterplay” rather than simply defending. While I took his point, I didn’t see the point of expanding on the queen side to grab a couple of pawns and (with a remote chance) threaten to queen one of my pawns there when my king was under heavy attack.

This bewilderment continued through the next year, as I studied openings for which the stated strategy was to “get counterplay on the queenside”. Not being a particularly great endgame player (though I did show some promise in that in my brief career), the advantage that could be gained by winning a pawn was lost on me, and I would prefer to go for a more tactical game (which usually didn’t go too well).

As an adult, while I don’t play competitively any more, I continue to follow chess and watch videos from time to time for entertainment. I’ve developed more nuance on strategy, and in playing a positional game. I’ve seen how small advantages (like space, or even a pawn) can be turned into decisive victories, and given myself shit for not learning to play endgames better back during my playing career. It’s a more holistic view of chess than the one I had formed as a schoolboy having mugged up all the moves of Morphy’s 17-move win against the Duke of Brunswick and Count Isouard (I still remember that game by heart).

Though it doesn’t take much convincing now for me to appreciate the joys of positional play, and going for queen side counterplay when your king is under attack, I found the game played by Viswanathan Anand against Veselin Topalov in the first round of the ongoing Candidates tournament rather interesting.

The two players go for different strategies – while Topalov builds up for an attack against Anand’s king, Anand goes for queen side counterplay (the bit I didn’t get back when I was a young player) and goes pawn grabbing. It was a rather complex game and both players played rather inaccurately under time pressure, but it is an excellent example of how queen side counterplay can help defuse an attack.

Anand’s queen nearly gets trapped (in the press conference after the game, he said he was reconciled to giving it up if attacked). There is a massive piledriver of pieces Topalov stacks up on the king side to attack Anand’s king. There is absolutely no threat of danger on Topalov’s king.

Yet, from time to time, Anand’s pawn grabbing strategy means Topalov has to move back some pieces to the queen side for its defence, blunting the attack. Then, Topalov needs to recover lost material, and moves his rooks to the queen side for that purpose. There is a mad scramble around the time control (both players got into time trouble) when the position gets liquidated with a lot of pieces exchanged.

After the dust settles, we find that Topalov’s remaining pieces are horribly misplaced on the queenside (on a pawn recovery campaign), while Anand’s are now trained towards an attack on Topalov’s king. As Topalov scrambles to defuse this attack, he loses material, and ultimately resigns.

It was a fascinating game to get a potentially fascinating tournament underway. I hope to follow it as best as I can, though that might not be so trivial given the holiday I’m taking later this month. Watching GM Daniel King’s analysis of Anand’s game (linked above) started making me wonder if I’d have played differently had I had access to such high quality commentary when I was still a competitive player two decades ago.

As for that tournament, I ended up beating the other senior player I played against. He blundered his queen in a typical tactical Sicilian Dragon Yugoslav Attack position (I was white). I placed second among all the “youngsters” there, and got my only prize money from chess after that game – a princely Rs. 80 (which wasn’t that bad for a schoolboy in 1994)!

How computers have changed chess

Prior to computers, limited depth of analysis meant chess strategies were “calibrated to model”. Now they’re calibrated to actual results, and that results in better strategies (unconstrained by aesthetics).

With the chess Candidates tournament starting in Moscow today (to decide World Champion Magnus Carlsen’s next challenger), I’ve been watching a few chess videos of late, and participating in discussions on why Anand has been finding it hard to play of late.

One thing that people have widely agreed is that computers have changed the way chess is played, and the “new generation” (Carlsen, Hikaru Nakamura, Fabiano Caruana, Anish Giri, etc.) have learnt the game in a completely different way from the old-timers, which dictates the way they play.

For example, these new guys play the kind of positions that earlier generations wouldn’t dream of playing. Given a position and a bunch of moves that seem similarly strong, the moves the new generation picks are different from what an older player would pick. And computer analysis is credited with this.

The basic advantage with computer analysis is that positions can now be evaluated easily to a much larger “depth” (number of moves from current position) compared to earlier manual analysis. In the manual analysis, you could evaluate the position for a few moves after which you would reach a position that you would judge manually. Judging different possible continuations this way, you would evaluate a position and figure what was a good continuation.

The problem with limited depth search was that after a certain depth, you simply had to use your judgment on what was a good position, and this judgment (the “boundary condition” that went into your model) would have a profound effect on how you evaluated different moves. Over time, all you cared about was the aesthetics of the chessboard, and not really how you could translate the position to victory (or a draw).

In other words, in the days before computers, chess players were building their strategies by calibrating them to a model rather than by calibrating them to actual results on the board. And this resulted in a bias towards “pretty strategies” and those that gave advantages that were obvious.

With computers, however, there is no such constraint on the depth of ply. You can analyse the position to far greater depth and get really close to the result in the course of your analysis. And so you don’t really care about the aesthetics of the positions you reach, as long as you know how they can translate to the result you want.
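The difference between the two regimes can be sketched with a toy search in Python. Everything here is hypothetical: the position, the move names and the `looks` scores (standing in for a player’s judgment of how good a position appears) are made up, and real engines are far more elaborate.

```python
def search(node, depth, maximizing):
    """Value of a position, looking at most `depth` moves ahead.

    A node with a "result" key is a finished game (+1 we win, -1 we lose).
    Otherwise it has "looks" (a judgment of the position) and "moves".
    """
    if "result" in node:
        return node["result"]
    if depth == 0:
        # Horizon reached: fall back on how the position looks.
        return node["looks"]
    values = [search(child, depth - 1, not maximizing) for child in node["moves"].values()]
    return max(values) if maximizing else min(values)


def best_move(position, depth):
    """Pick our move, given that the opponent replies next."""
    return max(
        position["moves"],
        key=lambda move: search(position["moves"][move], depth - 1, maximizing=False),
    )


position = {
    "looks": 0.0,
    "moves": {
        # Looks lovely, but the opponent has a refutation one move later.
        "pretty": {"looks": 0.8, "moves": {
            "refutation": {"result": -1.0},
            "cooperative reply": {"result": 1.0},
        }},
        # Looks awkward, but actually wins by force.
        "ugly": {"looks": -0.3, "moves": {
            "only reply": {"result": 1.0},
        }},
    },
}

shallow_choice = best_move(position, depth=1)
deep_choice = best_move(position, depth=2)
```

With a one-move horizon the search trusts appearances and picks `pretty`; with one more move of depth it reaches the actual results and picks `ugly`. That is the shift from calibrating to a model to calibrating to results, in miniature.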

So the “new generation”, which has always been trained using computers, see the game differently. People of Anand’s generation (there’s also Veselin Topalov and Levon Aronian in the ongoing Candidates tournament) learnt the game with classic aesthetics and optimise their play to get there. Carlsen’s generation has no such biases and they play to what is the actual advantage irrespective of aesthetics.

And that’s how the battle is building up! This should be an interesting tournament!

Volleyball

It’s been over eight years since I last played the game, but if I were to pick the one outdoor game I’m best at (relative to other games I’ve played) it’s volleyball. And when I say I’m best at that, it’s on a strict relative basis – in undergrad, I struggled to get into my hostel team (let alone college team). It just goes to show how bad I’ve been in other outdoor games! I’m a successful cricket and football-watcher, though!

The thing with volleyball is that my game runs counter to how I play other games, and my life in general. In general, I’m an extremely high-risk person – I’m not into adventure sports, though, but have a Royal Enfield motorcycle – I take chances where possible and go for the spectacular. It is hard for me to be “accurate” and “correct”, and given that I know that I’m prone to making mistakes I try to maximize the outputs from the times when I don’t make mistakes, and thus go on a high risk path.

So I’ve quit my job without something else in hand four times, now freelance as a management consultant, blog about every damn thing – things that have promises of big upsides, but also risks of downsides. It also reflects in how I sometimes talk to people – I sometimes try too hard to make an impression – which can potentially get me big returns, but end up saying something stupid at times, and end up sounding arrogant at other times. Those are risks I willingly take.

And this risky nature has reflected in most games I’ve played, also – again nothing in the recent past. In chess, I get bored of slow technical Carlsen-esque positions, and am prone to go on Morphy-esque attacks that can backfire spectacularly. Playing bridge, I finesse way more than I’m supposed to – making some otherwise unmakeable contracts, but going down in contracts I should have otherwise made.

Back in school, when we played cricket with rubber and tennis balls, I would bowl leg spin, and using a light bat, would try to hit every ball for four or six, rather than trying to bat steadily. And while playing basketball (my “second best” outdoor game, after volleyball) I have a propensity to go for long shots.

What sets volleyball apart is that my game completely runs counter to who I am. In volleyball I’m a solid player – don’t spike too much (can’t jump!!), but can set spikes well, block well and can lead a team well from the back line. In fact, my best volleyball games have been those when the team has had to carry some weak links, and I’ve led from the centre of the back line, lending solidity and helping build up attacks. It definitely doesn’t reflect what I’m like otherwise.

But volleyball has also been the game where I’ve had a large number of spectacular failures. At every level I’ve played, I’ve had some responsibility thrust upon me, and I’ve buckled under the pressure. It’s volleyball that comes to mind every time I let down people’s trust because I do badly at something I’m supposed to be good at.

1. Voyagers versus Pioneers, 1999: This was the school inter-house tournament. We go two sets up. They win the next two. Down to the decider. We lead 14-13, and it’s our turn to serve. Our captain purposely messes up our rotation such that I can serve (I had a big serve – one attacking aspect of my volleyball). The serve clips the net on its way across (back then, a let was a foul serve in volleyball). We lose.

2. NPS Indiranagar versus NPS Rajajinagar, 1999: Then I get selected to represent my school. I’m on the bench, and am subbed in right on time to serve. I decide to warm up with an underarm serve (before I start unleashing my overarm thunders). Hit it into the net. Opponent’s serve comes to me and I receive it badly. Get subbed out.

3. G block versus F block, 2004-05: Semi finals of the IIMB inter-hostel championship. We have two big spikers, two decent lifters and defenders (including me) and two who had never played volleyball in their lives, but were chosen on the basis of their physical fitness alone. Down to third set (best of three). We lead 25-24 (new scoring system). I’m playing right forward. Ball comes across the net. All I need to do is to set it up for a big spike, but I decide to spike it directly myself. And miss. Then I serve on the next match point. Decide to go for a safe serve, gets returned. We lose.

4. Section C versus Section A, 2004-05: Again similar story. I don’t remember the specifics of this, but again it was heartbreak, and I think I missed my serve on match point.

I guess you get the drift..

Non competitive hobbies

During my riding trip two months back, I was wondering why I enjoyed riding so much more than any of the other “hobbies” that I have indulged in over the last twenty years or so. It was tough for me to think about any other hobby that had given me as much pleasure in the early days as riding did, and no other hobby seems or seemed as sustainable as this one. As I rode, and daydreamed while I rode, I thought about what it was about riding that gave me the kind of unbridled joy that any of my other hobbies had failed to provide. The reason, I figured, was that it was not competitive (no I don’t intend to be a motorcycle racer, ever).

Looking back at the hobbies that I’ve had since childhood – be it playing chess or playing the violin or even writing, they have all been competitive hobbies. As soon as I got reasonably good at chess, I started playing competitively, and soon the pressures of tournament play got to me, I lost my love for the game and stopped playing. Violin was a little better off in the sense that for a reasonably long time I only played for myself (apart from the occasions when I had to entertain random visiting relatives). But then, I was asked to take up an examination, and then enter inter-school music contests, and I find it no surprise that I quit my lessons six months after my examinations. I must mention that I’m on the road to committing the same mistake again, in my second stint at violin learning. As things stand now, I’m scheduled to appear for the ABRSM Grade Three examination this October, but I have my reasons for that and don’t think the process of appearing for the exam will kill my love for music.

Writing remained a passion, and a hobby which I think I was rather good at, until the time I started thinking about monetization. The minute I started thinking about wanting to write for money, I lost the love for it, which might explain the deceleration in activity on this site over the last three years or so. I had lost yet another hobby to the competitive forces.

The thing with competition is that it puts pressure on you. You have to begin to hold yourself to a standard other than your own, and that means you will have to do certain things irrespective of whether you think it makes sense to do that. Soon, your hobby ends up as a slave to your competition, and it is unlikely you’ll be able to sustain interest after that. You can say that the moment a hobby becomes competitive, it ceases to be a hobby and becomes “work”.

The reason I’m bullish about motorcycling at this moment is that I don’t see a means for it to become competitive. Since I don’t intend to race, and don’t care about whether others have ridden more than me or whatever, I’ll be mostly riding for myself. Yes, when I planned my Rajasthan tour, I did think of monetizing it by writing about it for the media, but that I think was more a function of wanting to monetize my writing than my riding. In the event, I didn’t get a mandate to write, and that in no way affected my enthusiasm for the ride. Rather I felt freer that I could enjoy the ride rather than thinking about what I would write about it.

As I go along, I hope to pick up one or two more such non-competitive hobbies. Of course I intend to make motorcycling a “major” hobby. As it is, I love traveling, doing it my own way and going off the beaten path. And I love the feeling as I accelerate, with the wind penetrating the air vents of my riding jacket and my thighs grabbing the petrol tank. Now if only I can convince Pinky to also take this up as a hobby..

Charades of obscurity

Having “played” dumb-charades (DC for short) competitively at a school and college level, I don’t particularly enjoy playing it casually. I’m prone to getting annoyed when people around me (either on a picnic, or a party) exclaim with great enthusiasm that we should play DC. Till recently I used to think it was like chess – where my enthusiasm for the game has been killed purely because I played it competitively, but now I realize there are more reasons.

The challenge with “competitive” DC is that it is a timed game. You are judged based on how fast you can act out a certain name/place/animal/thing. Because of this the clues need not be too hard, and there is a fair degree of challenge in acting out even simple things. Apart from this, the clues are set by a neutral third party which means they can all be trusted to be of approximately similar standard, so there is some sort of a level playing field there. Then, you have teams that have practiced well together, and have clues for all the trivial stuff, and you have a game!

With casual DC, there are several problems. Firstly, the games are not timed. Secondly, the teams haven’t practiced together at all, so it takes ages to communicate even straightforward stuff (which is why the games aren’t timed). And then the clues are usually given to you by your competitor. And for some reason, casual DC always has to be movies. No books, no places, no animals, no personalities, nothing.

The fact that the games are not timed, combined with the fact that the clues are given by the competitor, means that the game usually gets into a downward spiral of obscurity. You don’t want your competitor to guess the movie easily, so you give a vague movie. And they reply with something vaguer. And so forth, until teams have to check IMDB to find out if the movies actually exist. By which time all the enthusiasm for the game is lost.

On a recent trip (with colleagues, as part of our CSR initiative. more on that in another post) we played casual DC, and after some 10 clues it had gotten so obscure that nothing was guessable. I’d lost interest when someone suggested we do Kannada movies! Now, that’s something few people would’ve played – DC with Kannada movies as clues, because of which we could give clues while not keeping them too obscure (but it was hard. I completely bulbed trying to act out “Kalasipalya”).

Still, my hatred for casual DC remains, and I try as much as possible to not play it. Maybe next time I’ll impose conditions (like timing, choice of subjects, etc.), and refuse to play if they want to do English movies with infinite time.