Tuesday, December 12, 2017

AlphaZero vs Stockfish part 1

As we all probably know by now, AlphaZero has played a 100-game match against Stockfish 8. Both sides had one minute per move, probably because AlphaZero does not have time management yet. AlphaZero won 28 games and did not lose a single one. A result like this is almost unheard of in computer chess tournaments. It shows how AlphaZero has developed an 'intuition' for which positions to calculate, rather than relying on an extremely fine-tuned alpha-beta search. AlphaZero searches just 80 thousand positions per second, compared to 70 million for Stockfish. However, its deep neural network focuses much more selectively on the most promising variations, similar to how a human would calculate[1], so AlphaZero's games can seem quite human-like. In this article, I'll be showing the first 5 games that were made public[2].
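To put those numbers in perspective, here is a rough back-of-the-envelope calculation. It assumes the remaining 72 games were draws (100 games, 28 wins, no losses) and uses the standard logistic Elo formula. The function and variable names are my own, and a single 100-game match is a small sample, so treat the Elo figure as a ballpark only.

```python
import math

def elo_difference(wins, draws, losses):
    """Elo gap implied by a match score, using the standard logistic model."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    return 400 * math.log10(score / (1 - score))

# Match result from AlphaZero's side; the draw count is inferred, not quoted.
WINS, LOSSES, GAMES = 28, 0, 100
DRAWS = GAMES - WINS - LOSSES

# Search speeds quoted above, and one minute of thinking time per move.
ALPHAZERO_NPS = 80_000
STOCKFISH_NPS = 70_000_000
SECONDS_PER_MOVE = 60

alphazero_positions = ALPHAZERO_NPS * SECONDS_PER_MOVE
stockfish_positions = STOCKFISH_NPS * SECONDS_PER_MOVE

print(f"Match score: {WINS + 0.5 * DRAWS:.0f}/{GAMES}")
print(f"Implied Elo gap: about {elo_difference(WINS, DRAWS, LOSSES):.0f}")
print(f"AlphaZero positions per move: {alphazero_positions:,}")
print(f"Stockfish positions per move: {stockfish_positions:,}")
print(f"Stockfish looks at {STOCKFISH_NPS // ALPHAZERO_NPS}x more positions")
```

So AlphaZero scored 64 points out of 100, roughly a 100 Elo gap, while looking at 875 times fewer positions on every move.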
Board 1: Piece activity

AlphaZero's play is quite human-like, probably due to the way it searches. It also likes to immobilize white's pawns - a theme to be repeated many times.
The sacrifice by Stockfish seemed quite promising at first, but it was not able to attack, as AlphaZero's pieces were much better placed in the middlegame than Stockfish's. The games will start to get more interesting from here.
Board 2: Light square weaknesses

This game is extremely Nimzovich-like, overprotecting squares. In this Ruy Lopez variation, white exchanges away the light square bishop for a knight, hoping to exploit the doubled pawns. It seems that keeping the light square bishop would have been better, instead of being left with pawns on e4 and c4 as light square targets.
AlphaZero is playing chess more positionally, and less engine-like, similar to how people used to play against engines, until computational power took over.
Board 3: Dark square weaknesses

A middlegame zugzwang, how often do you see that? AlphaZero is really good at its positional games, exploiting a kingside fianchetto structure when black does not have its dark square bishop, and then 'trapping the queen' and completely paralyzing black, forcing black to lose material by zugzwang.
The sight of all the major pieces just sitting on your sixth rank dark square weakness looks extremely intimidating. Add to that having your pieces paralyzed, and this was an extremely torturous game for Stockfish.
The pawn sacrifice will happen in another game, suggesting that this pawn sacrifice for activity is possibly dangerous for black in the Queen's Indian Defense (QID). Black may be able to draw by returning the d-pawn, which the queen and rook were pressuring, even after weakening the dark squares.
Board 4: Backwards pawns

AlphaZero is really good at exploiting weaknesses: only a bishop of the wrong color and a backwards pawn, and black crumbles. Stockfish is constantly getting positionally outplayed, although surprisingly it survives the endgame for quite some time. It is also nice to have a French Defense game where black loses positionally (I really dislike playing against the French).
Board 5: Misplaced pieces

Looks like the 'Kasparov pawn sac' in the QID is working quite well, immobilizing the queenside. 8...c6 was probably a blunder, as it allows white to attack the d6 square with both its knights and queen.
AlphaZero moves the same piece many times in the opening to misplace black's pieces instead of developing. Misplacing the opponent's pieces seems to be of higher importance than development here.
The QID keeps losing because black's pieces are left inactive after the pawn sacrifice. Either Kasparov refuted the QID many years ago and AlphaZero has now found the same refutation, or black should prevent the gambit with 6...d5! instead of 6...O-O. We'll have to see AlphaZero play against itself in the QID to know with higher confidence.
AlphaZero's playing style is similar to William Steinitz in the middlegame and Capablanca/Bobby Fischer in the endgame. Its opening choices aim for a positional advantage so that it can completely outplay the opponent in the middlegame, forcing weaknesses.
Stockfish has been positionally destroyed in all of these games, most likely because AlphaZero has learned to play positionally rather than rely on brute force calculation. Brute force calculation is vulnerable to the horizon effect, and running an engine over these games shows it quite nicely: the engine gives Stockfish an advantage or rates the position as a draw, until a few minutes later it gives AlphaZero a positive score. We may be seeing more positional play in grandmaster games soon.
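For anyone unfamiliar with the horizon effect, here is a toy sketch of it. This is plain depth-limited minimax on a tiny hand-made tree, not Stockfish's actual search (which is alpha-beta with many refinements), and every score in it is invented for illustration rather than taken from the games. The 'grab' line wins a pawn right away but leads to a paralyzed position four plies later; the 'quiet' line stays level.

```python
def minimax(node, depth, maximizing):
    """Depth-limited minimax; scores are from the maximizing side's view."""
    children = node.get("children", [])
    if depth == 0 or not children:
        # At the horizon we can only trust the static evaluation.
        return node["eval"]
    values = [minimax(child, depth - 1, not maximizing) for child in children]
    return max(values) if maximizing else min(values)

def line(evals):
    """Build a single forced line of positions from a list of static evals."""
    node = {"eval": evals[-1]}
    for score in reversed(evals[:-1]):
        node = {"eval": score, "children": [node]}
    return node

# Hypothetical scores in pawns. The pawn grab looks fine for three plies,
# then the pieces turn out to be paralyzed.
GRAB = line([1.0, 1.0, 1.0, -3.0])
QUIET = line([0.0, 0.0, 0.0, 0.0])
ROOT = {"eval": 0.0, "children": [GRAB, QUIET]}

shallow = minimax(ROOT, 2, True)
deep = minimax(ROOT, 4, True)
print(f"Depth 2 evaluation: {shallow:+.1f}")
print(f"Depth 4 evaluation: {deep:+.1f}")
```

At depth 2 the search happily reports +1.0 for grabbing the pawn. At depth 4 the trouble comes into view and the best it can find is the level quiet line at 0.0. That is the same kind of evaluation swing you see when an engine follows these games move by move.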
In part 2, I'll be showing the other 5 games and hopefully more games will be out by then.
References so I don't get sued:
[1] arXiv:1712.01815v1 [cs.AI]
[2] Google's AlphaZero Destroys Stockfish In 100-Game Match

