Monday, March 9, 2015

The Freestyle Battle 2014

Computer-based Chess with Houdini & Co.
by Arno Nickel
Since 4 February the Freestyle Battle 2014, arguably the strongest computer chess tournament of all time, has been in progress on the InfinityChess server. Thirty invited players from the fields of Freestyle Chess, pure computer chess and correspondence chess are playing a round-robin tournament with a long time control (90m+15s, that is, 90 minutes plus a 15-second increment per move). The participants are free to choose which tools they use for help, which hardware they play on, and which chess programs they use. The overall prize fund amounts to 20,000 US dollars, with a first prize of 5,500 US dollars. After 18 of 29 rounds the English player Anson Williams (nickname "Intagrand") leads on tie-break, ahead of the Filipino player Alvin Alcala (nickname "Maximus"), who has the same number of points.

29 rounds all play all
The tournament format evokes memories of times long gone, when classical tournament chess still indulged in the luxury of gigantic round-robin tournaments. Karlsbad 1911 had 26 participants, and in New York 1889 20 participants played a double round-robin tournament lasting more than two months. Of course, an online tournament such as the Freestyle Battle 2014 cannot be compared with such events. There is no rent to pay for a playing venue and no hotel costs are incurred; the players continue to work in their regular jobs, or do whatever else they do, and only meet for the games on Tuesdays, Thursdays, and Saturdays. For a player from St. Petersburg it may be rather late, 11 pm, while Big Ben in London is just striking 7 pm and the chess player in New York has just had lunch at 2 pm. However, despite the fantastic opportunities of online chess, not everyone is willing or able to commit to a gigantic tournament with 29 playing days. Some might even be discouraged, which is why this format, which the tournament sponsor from Abu Dhabi wished for, will certainly not be the format of choice for official qualification tournaments. But let’s not get ahead of ourselves, and look at other things first. What, in fact, is Freestyle Chess about, and how is such a tournament organized?

The soul of Freestyle Chess
One of the most exciting questions in computer chess is: can humans still add something of value to the deep calculations of the machine, or even counter it? Or, as far as Freestyle Chess is concerned: is the computer/human team, the so-called "centaur", really better in a match than a computer operating on its own?
A few years ago, between 2005 and 2008, during the PAL/CSS Freestyle tournaments on the ChessBase server, which offered prizes, this question was hotly debated and answered with a "yes, but...". The results were in favor of the centaurs, despite occasional spectacular successes of the machines. Preparation played a significant role here, because the superiority of the centaurs was more marked in round-robin tournaments than in open tournaments with pairings made at short notice. Specific opening choices, time management, structural knowledge, positional feeling, and deep advance analysis of critical variations (playing through the lines) were the cornerstones of the centaur strategy, even though one had to concede that the computers achieved a relatively high number of draws, particularly when playing with White.

Logo of the PAL/CSS Tournament

Today, at a time when computer technology is developing rapidly and a new generation of chess engines has changed the chess world, the question of the role humans play in this battle can no longer be answered that clearly. A lot of chess commentary and live video streams at tournaments, in which engines (mostly Houdini) run parallel to the games, sometimes create the impression that the chess engines know it all. What, then, can a human do? In reality, however, things look rather different, as everybody knows who has ever tried to analyze, with the help of a computer, positions in which several candidate moves of apparently equal value are possible.

Similar to correspondence chess, but with a completely different time frame, Freestyle is about exploring the intricacies of a position more deeply than a tournament player can over the board. Most of the time an engine can do this on its own, but depending on the position, the hardware the engine is running on, the opening book the engine has, and how its parameters are tuned, there will again and again be situations in which a human can profitably influence the analytical process, namely as follows:
- Regulating the time factor
- Deepened analysis of critical variations
- Choosing the mode of analysis (single or multiple variation mode)
- Comparing different programs simultaneously
- Choosing between engine moves of equal value
- Applying structural chess knowledge.
All these points play an important role in a match between well-equipped freestylers and may result in highly interesting games, which please the eye of the beholder because of their unique perfection. Whether one succeeds in this respect is always an open question, however, because not every opening variation leads to interesting positions and not every human intervention in the computer analysis is useful or crowned with success. Still, such matches are definitely exciting, not least because of the many imponderables, which include technical factors such as engine bugs and time trouble.

Freestyle Chess Champion Anson Williams

Standings (after 18 of 29 rounds)
Rank Player                  (Nickname / Country)                  Points (Tie-break)
 1.  Anson Williams          (Intagrand / England)                 13     (106.5)
 2.  Alvin Alcala            (Maximus / Philippines)               13     (99.75)
 3.  Uwe Märtens             (Regina-H.Milch / Germany)            12
 4.  Patrik Schoupal         (EtaoinShrdlu / Czech Republic)       11.5   (92.5)
 5.  Roland del Rio          (Thomas_A_Anderson / Germany)         11.5   (88.25)
 6.  Mark Sabu               (Deepthroat / USA)                    11     (92)
 7.  Werner Bergmans         (MIG29 / The Netherlands)             11     (90)
 8.  Dr. Michael Glatthaar   (Donkasand / South Africa)            10.5   (98)
 9.  Juan Molina             (Ozymandias / Spain)                  10.5   (97.25)
10.  Igor Dolgov             (Idol99 / Russia)                     10.5   (89)
11.  Bojan Fajs              (Jobboy / Slovenia)                   10.5   (83)

The complete table can be found here (see "standings").

The first four in the table have previously been very successful in Freestyle, but most of the other players are also well known in the Freestyle scene. The newcomers are Roland del Rio and Bojan Fajs, who, just like Igor Dolgov, are active correspondence players and title holders.

Anson Williams became the 7th Freestyle Champion at the 2007 PAL/CSS tournament on the ChessBase server. Some months later he won the Advanced Chess tournament in Benidorm 2007.
Alvin Alcala won the Freestyle Cup on the FICGS server in 2011 and 2013.
Uwe Märtens was second at the 8th PAL/CSS Freestyle tournament in 2008.
Patrik Schoupal was third at the 5th PAL/CSS Freestyle tournament in 2007.

Centaurs against pure engines
In the Freestyle Battle 2014 the participants can choose every round whether they want to play as centaurs, which technically means entering the moves manually, or whether they let a UCI engine play automatically (with a specially prepared opening book, of course). 16 of the 30 participants always play as centaurs; another 9 play mainly as centaurs but have in a few cases (when they had other obligations) employed an engine; 3 computer players have occasionally tried their luck as centaurs; and only 2 players rely exclusively on engines. Roughly speaking, 83% of the field are centaurs and 17% pure engine players. The engine players thus play against centaurs more often than not, and about a third of all games are played between these two groups. In only 10 of 265 games did computer programs play against each other.
In the competition between centaurs and pure engines, which does not, however, affect the distribution of prizes, the centaurs lead 53.5 to 42.5 after 18 rounds (+24 / =59 / -13), which on average is a margin of about one point in every ten games (5.5:4.5). In 54 of these 96 games the centaurs played with White and in 42 with Black, so they had a certain advantage from the random sequence of the games. This may even out later. But the distribution of colors is an important factor, because the superiority of the centaurs relies heavily on the white pieces: of the 24 wins the centaurs scored against the engines, 20 were achieved with White. The engines have scored 9 wins with White compared to 4 wins with Black. On the current trend, centaurs therefore score about 65% with White but only about 44% with Black (20 wins, 30 draws and 4 losses in 54 games, against 4 wins, 29 draws and 9 losses in 42 games). This suggests that the advantage of the centaurs lies mainly in exploiting opening advantages, and that it almost vanishes if no opening advantage is gained.
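The color split can be re-derived from the figures in this paragraph alone. The following is a minimal sketch, assuming the usual scoring of one point per win and half a point per draw; the class and function names are made up for illustration, and the draws per color are not stated in the text but follow from games minus wins minus losses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Win/draw/loss record from the centaurs' point of view."""
    wins: int
    draws: int
    losses: int

    @property
    def games(self) -> int:
        return self.wins + self.draws + self.losses

    @property
    def points(self) -> float:
        # A win counts one point, a draw half a point.
        return self.wins + 0.5 * self.draws

    @property
    def score_pct(self) -> float:
        return 100.0 * self.points / self.games


def centaur_record(games: int, wins: int, losses: int) -> Record:
    """Derive the number of draws from games played, wins and losses."""
    return Record(wins=wins, draws=games - wins - losses, losses=losses)


# Figures quoted in the text (after 18 rounds).
overall = Record(wins=24, draws=59, losses=13)
# Centaurs had White in 54 games: 20 wins, 4 losses (the engines' wins with Black).
with_white = centaur_record(games=54, wins=20, losses=4)
# Centaurs had Black in 42 games: 4 wins, 9 losses (the engines' wins with White).
with_black = centaur_record(games=42, wins=4, losses=9)

for label, rec in [("overall", overall), ("White", with_white), ("Black", with_black)]:
    print(f"{label:8s} +{rec.wins} ={rec.draws} -{rec.losses}  "
          f"{rec.points}/{rec.games}  ({rec.score_pct:.1f}%)")
```

The derived draws for the two colors add up to the 59 draws reported overall, which is a useful consistency check on the quoted numbers.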

Most popular engines: Stockfish, Houdini and Komodo
What kind of programs are these three, and to what extent are they used? Here it is interesting to look at a survey in which the participants were asked before and during the tournament about the programs they use. 21 of the 30 participants said that they were using the following engines, either for analysis or for the games: Stockfish 18, Houdini 14, Komodo 12, others 5. As far as the overall favorite is concerned, Stockfish also fares better than Houdini, though not all participants have just one single favorite. Komodo, however, is obviously used almost exclusively as a third option. A typical line of reasoning for this choice goes: Stockfish because of its depth of calculation and because it is presumably the tactically strongest engine, Houdini as the most balanced and positionally reliable engine, and Komodo as the engine with the most endgame knowledge.
Stockfish, in a number of different versions, is also definitely the engine most often used as a pure engine: 48 of the 96 tournament games against centaurs were played by the open-source engine. This led to 8 wins, 14 losses and 26 draws. Houdini 4 was used 13 times (+0 / =8 / -5), Komodo 5 times (+1 / =3 / -1).

"Cryptic" most successful against centaurs
However, the second-largest share of games against centaurs, 30 of the 96 tournament games, was played by a private engine called Deep Cryptic Cyclone 2.7 or Cryptic Heroes 1.1, which is consistently used by two participants from Abu Dhabi on pretty strong hardware. The engine is privately sponsored by Zorchamp, the second Freestyle Champion, who won the title in 2006 and who, after an absence of a couple of years, is again taking part in competitions, this year in the Freestyle Battle 2014. Like Houdini and Stockfish, this engine, whose developers are not known by name, seems to draw on earlier versions of Rybka and Fruit, and with +4 / =22 / -4 it has an even score against centaurs. In pure engine tournaments on the InfinityChess server Cryptic regularly finishes in the top group, which is probably due to the superior hardware of two Xeon processors with 20 or 16 cores respectively. If the opening books used are improved, the strengths of engine and hardware will presumably be even more pronounced.
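The per-engine records against centaurs quoted in this section can be put side by side. This is a small illustrative sketch, again assuming one point per win and half a point per draw; the dictionary layout and names are chosen here for the example and are not part of the tournament data.

```python
# Results against centaurs as quoted in the text, from the engine's point of view:
# name -> (games, wins, draws, losses)
engine_results = {
    "Stockfish": (48, 8, 26, 14),
    "Cryptic": (30, 4, 22, 4),
    "Houdini 4": (13, 0, 8, 5),
    "Komodo": (5, 1, 3, 1),
}


def points(wins: int, draws: int) -> float:
    """One point per win, half a point per draw."""
    return wins + 0.5 * draws


total_games = total_wins = total_draws = total_losses = 0
for name, (games, wins, draws, losses) in engine_results.items():
    # Each record should be internally consistent.
    assert games == wins + draws + losses, name
    pct = 100.0 * points(wins, draws) / games
    print(f"{name:10s} {points(wins, draws):4.1f}/{games:<2d} ({pct:.1f}%)")
    total_games += games
    total_wins += wins
    total_draws += draws
    total_losses += losses

total_points = points(total_wins, total_draws)
print(f"{'all':10s} {total_points:4.1f}/{total_games} "
      f"(+{total_wins} ={total_draws} -{total_losses})")
```

Summed over the four groups, the records account for all 96 engine-versus-centaur games and for the engines' 42.5 points.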

Centaur statue of 1892 in the Tuileries Gardens, Paris
(Photo by Marie-Lan Nguyen)

Strengths and weaknesses of centaurs and pure engines
In regard to the reasons for wins and losses in the battle between centaurs and engines the following becomes apparent:
1. Centaurs are more flexible and better at selecting promising middlegame and endgame strategies, the choice of which often begins in the opening.
2. Centaurs can discover successful paths of analysis discursively by using different engines, which complement but sometimes also contradict each other. (This phenomenon had been noticed before but was pushed into the background somewhat in the heyday of Rybka, when this engine dominated the other engines from 2007 to 2010.)
3. If centaurs know the opposing engine (even if only because of the evaluations published about it) they can specifically try to exploit its supposed weaknesses.
4. Centaurs can get a hardware advantage by using several computer systems – if available.
5. Centaurs can structure the analytical work more effectively by sharing the workload among several helpers. However, help teams are not easy to organize. The internal coordination costs time and requires a harmonious, well-attuned team.
6. Centaurs on average need more time for handling the processes of analyzing and operating the engine, which may turn out to be a disadvantage and a risk.
7. Centaurs are prone to human error (operating mistakes, misinterpretations, etc.).
8. Engines operating on their own do not lose the time a user loses, and thus they emerge from the opening book with a time surplus because of the 15-second increment for each move played. If the book is very good and efficient, this can be a strong factor in the further course of the game.
9. Engines operating on their own make full use of the opponent's thinking time to continue analyzing and to anticipate possible or forced replies (so-called "pondering"). If the expected move is played, they may answer immediately with a move they have already analyzed deeply.
10. Depending on the position on the board, engines operating on their own can profit from a higher degree of objectivity because their analysis runs uninterrupted, that is, without human interference, which sometimes allows them to find key moves that the centaurs have either underestimated or even overlooked. Here the hardware can become the crucial factor.
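
Point 8 in the list above can be made concrete with a little clock arithmetic for the 90m+15s time control. The sketch below is only an illustration: the book length of 20 moves and the 30 seconds a centaur might spend per opening move are assumed values, not figures from the tournament, and the helper names are invented for the example.

```python
BASE_SECONDS = 90 * 60      # 90 minutes of base time
INCREMENT_SECONDS = 15      # increment added after every move played


def clock_after(moves: int, seconds_per_move: float) -> float:
    """Remaining time after `moves` own moves, each costing `seconds_per_move`.

    Simplified model: assumes the clock never runs out along the way.
    """
    return BASE_SECONDS + moves * (INCREMENT_SECONDS - seconds_per_move)


def as_minutes(seconds: float) -> str:
    whole = int(seconds)
    return f"{whole // 60}:{whole % 60:02d}"


BOOK_MOVES = 20             # hypothetical length of the opening book line
engine_clock = clock_after(BOOK_MOVES, seconds_per_move=0)    # book moves played instantly
centaur_clock = clock_after(BOOK_MOVES, seconds_per_move=30)  # hypothetical manual handling

print("engine :", as_minutes(engine_clock))
print("centaur:", as_minutes(centaur_clock))
print("surplus:", as_minutes(engine_clock - centaur_clock))
```

Under these assumed values the engine leaves the book with more time than it started with, while the centaur has already dipped into the base time.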

In general, superior hardware is always an advantage, particularly when pure depth of calculation is required in critical positions and, even more so, in time trouble. However, it has become apparent that many positions (particularly in the endgame) that previously were not analyzed properly when the engines were short of time and running on slower hardware are today handled without errors by standard quad-core computers (even more so if connected to tablebases), which means that the superior hardware of the opponent cannot strike as easily as it could before. This indicates that technical progress, as far as the mutual chances in Freestyle Chess are concerned, tends to have a leveling influence, which shows in the increasing number of draws (about 67%) in this tournament. At 59 percent the drawing percentage is slightly lower in the unequal battle of centaurs against engines, but in the battle of centaurs against each other the drawing percentage currently stands at 72.5%, which in a field of 30 participants indicates that the players are of rather equal strength.
In 2007/2008 an Intel quad-core of the Q6600 type (built from two Core 2 Duo dies) used to be the high-end standard in Freestyle Chess, but it was soon replaced by the new processors of the Core i series. Today, a single Core i7 processor with 4 or 6 cores is the most frequent setup in Freestyle. The participants of the Freestyle Battle 2014, who usually have two PCs running for analysis, use about 18 quad-cores (among them 10 of the i7 series) plus nine 6-core PCs, also belonging to the i7 series. Previously, systems with even more processor cores were an absolute exception; now, according to the participants, five 8-core machines (AMD and Xeon), three 12-core machines, six 16-core machines and about six 24-core machines are used in the Freestyle Battle. The absolute highlight seems to be Zorchamp’s machine with 2 Xeon processors with 20 cores each.

A glaring case in which a centaur missed a win despite great calculating power is shown in the following position:

  Screenshot of the centaur game Takker vs. Erdo, round 9, February 22nd, 2014

Black, to move, played 34...Ra4 here; after queens and rooks were exchanged on g7 with 35.Qxg7+ Qxg7 36.Rxg7 Kxg7, and after further simplification by 37.Nxc7 Kf6 38.Ne6 Bxe6 39.dxe6 Rxa3 40.Rxa3, an equal rook ending resulted. Instead 34...Qh4+!?, continuing the kingside attack and avoiding simplification, was an alternative deserving attention, which, however, like the text move, was evaluated by a number of different engines with a rather low value or a value of 0.00, that is, as a draw. Here Erdogan Günes ("Erdo"), who played with Black, had too much faith in the engines and focused on the endgame arising after 34...Ra4.
Curiously enough it was his opponent, William Fuller ("Takker"), who had recently become a correspondence chess IM, who saw during the three and a half minutes Black spent on his move that Komodo was analyzing 34...Qh4+!? and noticed the strength of this move, as he indicated after the game. Subsequent tests show that both Stockfish DD and Houdini 4 are relatively clueless up to depth 30 that Black is winning after 34...Qh4+ 35.Ke2 Rg3 36.Nxc7, because the winning continuation, based on a rook sacrifice which only pays off about ten moves later, is at first not considered deeply enough for evaluation reasons. It took Houdini 4 Pro x64B on an i7-960 quad-core machine 80 minutes in single variation mode before it first showed 34...Qh4+ with a minimal positive tendency (-0.07/depth 32), that is, with the first indication that this might perhaps not be a drawn position. After 100 minutes the evaluation had risen to -0.55 (depth 32). Here the test was aborted. Stockfish DD was significantly faster to catch on, and on the same machine needed three minutes, at depth 35, to find the winning move.
If you enter the variation 34...Qh4+ with the obvious moves 35.Ke2 Rg3 36.Nxc7 and do not interrupt Stockfish DD's happy calculations, it takes only about half a minute until the engine indicates 36...h2!! (the quiet rook sacrifice on a8) as a probably winning line (-1.14 at depth 29), and half a minute later the evaluation already rises to -2.18 with a relatively stable variation, which is already rather watertight: 36...h2!! 37.Rxg3 fxg3 38.Nxa8 Kf7! 39.Qe1 Bh3, and no matter what White does now, Black's two passed pawns will finish him off, for example: 40.Nc7 Bg2 41.Ne6 Bxh1 42.Qxh1 Qh6! (preventing counterplay) 43.Kf1 Qh3+ 44.Ke2 g2 45.Qc1 Qh4 and Black has his sights on his second queen.
For the sake of completeness it should be mentioned that the Komodo engine (version 6 TCECr), on the quad-core machine used for the test, first indicates the winning move 34...Qh4+ at depth 29 after about seven minutes, and thus in this case is tactically more aware than Houdini 4. However, this is even more true of the Rybka 4 engine, which today is not as fashionable as it used to be and which after only 5 minutes (at depth 21) indicates the right way with the quiet rook sacrifice 36...h2. In our series of tests Deep Fritz 14 also changed track after 11 minutes, at depth 26, and favored 34...Qh4+ over 34...Ra4. The engines might find the right move even more quickly if the hash tables are already filled with lots of variations and evaluations, which will certainly have been the case with William Fuller alias Takker.
Grandmaster Christopher Lutz showed the most plausible lines after 34...Qh4+ in detail in his report about this round. His comments and analysis of all rounds are published on the tournament website (where you can also find, among other things, the regulations) and can be played through online. Moreover, registered users of the InfinityChess web portal can download all games. From now on, selected games will be shown via live video (Tuesdays, Thursdays and Saturdays starting at 8 pm CET).

How do things continue with Freestyle?
InfinityChess does not want to leave it at the current round-robin tournament, but rather sees it as a test for organizing more tournaments in which prizes can be won. There are ideas for organizing a 13-round Freestyle Open in June 2014, and there is the idea of a Freestyle World Championship. Similar to a classical World Championship, this would be a duel between two players, who would qualify in a candidates tournament taking place before the World Championship match. This duel, however, would not take place in the virtual world of the internet but at a real location. In that case, and with adequate prize money, one or another chess professional might become interested in Freestyle. In contrast to previous years, in which the somewhat computer-illiterate chess masters did not fare that well in the young discipline of Freestyle Chess, they have since learned a lot about the techniques and intricacies of handling hardware and software (particularly after the appearance of Houdini), and with some training they can harbor justified hopes of winning some prize money in such a tournament. After years of stagnation it would be very good for Freestyle if it could thus establish itself firmly in the tournament calendar as the "Formula 1 of Chess".

    Tournament Director 
   Arno Nickel (ICCF GM)

(Translation from German by Johannes Fischer)

Retrieved from an Infinity article, 2014.
