Internet engine matches

Discussion about development of draughts in the time of computer and Internet.
Ed Gilbert
Posts: 860
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Re: Internet engine matches

Post by Ed Gilbert »

Kingsrow and Damage are running another internet engine match this weekend. We got a late start on Saturday so it's not finished yet. Damage is testing its new 7pc db and kingsrow is using its 8pc db. I'll post something after the weekend.

We also had some internet matches in 2009 but I don't see them described in this thread. They may be in a different thread, but I can't find it at the moment.

-- Ed

Re: Internet engine matches

Post by Ed Gilbert »

Kingsrow and damage played another internet engine match last weekend. This was, I think, the fourth match of this type. The previous one was played in July 2008. New for damage is a 7pc endgame db and some eval improvements. New for kingsrow is a full 8pc db and also eval improvements.

In the previous matches we purposely set up both engines to be configured as similarly as possible: no pondering, both using 6pc databases, and single-threaded search. This time I decided to use a more fully configured kingsrow, so it ran with 4 search threads, pondering on, and using the full 8pc db plus the 5men vs. 4men subset of the 9pc db, running on a 2.4GHz Q6600 with 8GB ram. Damage ran a single search thread on (IIRC) a 3GHz i7 with 6GB ram.

Time controls were the same as in previous matches. Kingsrow was set up to make 75 moves in 10 minutes. Damage does not have this type of total match time management, so it was set for 10 seconds per move. This is more than 10 minutes for 75 moves, but as most games are shorter this made the average time used about the same for both.
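As a rough illustration of why the two time settings come out about even, the arithmetic can be sketched as below. The function names are hypothetical and are not taken from either engine; the sketch only compares a fixed time per move with a total budget for a block of moves.

```cpp
#include <cassert>

// Total thinking time (seconds) for an engine that spends a fixed time on every move.
inline int fixed_time_total(int moves, int seconds_per_move) {
    assert(moves >= 0 && seconds_per_move > 0);
    return moves * seconds_per_move;
}

// Largest number of moves for which the fixed per-move setting stays within the budget.
inline int break_even_moves(int budget_seconds, int seconds_per_move) {
    assert(budget_seconds >= 0 && seconds_per_move > 0);
    return budget_seconds / seconds_per_move;
}
```

With the match settings, `fixed_time_total(75, 10)` is 750 seconds against a 600-second budget, and `break_even_moves(600, 10)` is 60, so the fixed setting only uses more total time in games where the engine makes more than 60 moves.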

The results for kingsrow vs. damage: 24 wins, 4 losses, 130 draws, 0 unknowns.

While this looks good for kingsrow, it represents a noticeable improvement for damage. The results of the previous match were +30, -3, =123, U2, and this time kingsrow had some endgame db, pondering, and parallel search advantages that it did not have in previous matches. It is difficult to draw conclusions because there is a lot of randomness in the results from "only" 158 games, but from these results it looks probable that damage is improving more quickly than kingsrow.
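To put a number on that randomness, here is a minimal sketch that computes the mean score per game and its standard error from a win/loss/draw record. It assumes the games are independent and scored 1, 0.5 and 0, and it leaves out the games with unknown results; `MatchResult`, `mean_score` and `standard_error` are illustrative names, not part of any engine or rating tool.

```cpp
#include <cassert>
#include <cmath>

// Win/loss/draw counts seen from one side of the match.
struct MatchResult {
    int wins;
    int losses;
    int draws;
};

// Mean score per game, with win = 1, draw = 0.5, loss = 0.
double mean_score(const MatchResult& m) {
    const int n = m.wins + m.losses + m.draws;
    assert(n > 0);
    return (m.wins + 0.5 * m.draws) / n;
}

// Standard error of the mean score, treating the games as independent samples.
double standard_error(const MatchResult& m) {
    const int n = m.wins + m.losses + m.draws;
    assert(n > 1);
    const double mean = mean_score(m);
    const double mean_of_squares = (m.wins + 0.25 * m.draws) / n;  // 1^2 and 0.5^2
    const double variance = mean_of_squares - mean * mean;
    return std::sqrt(variance / n);
}
```

For the two matches above this gives roughly 56.3% +/- 1.6% for the latest result (24, 4, 130) and 58.7% +/- 1.7% for the previous one (30, 3, 123), one standard error each. The gap between the two scores is about one standard error of their difference, which fits the caution about drawing conclusions from this few games.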

We had 2 interruptions during the match. After about 20 games, damage crashed with what Bert thinks is an endgame database driver bug. He had seen this before and he has been trying to fix it, but it happens randomly and not very often, so it is difficult to troubleshoot. The other interruption occurred after about 100 games. Bert's local internet service was down for a short period. He could not access the internet at all, and our Dam Exchange connection was lost. After it was restored we reconnected and resumed, and there were no further interruptions.

These matches are always interesting and entertaining to watch, and I spent quite a few hours over the weekend just watching the games. I have been recovering from a broken bone in my foot, the result of a running injury, and I'm not too mobile at the moment anyway, so this was a nice way for me to spend some recovery time. Bert and I communicated during the match via Yahoo Messenger, and as always the conversations were lively. Among other things we started brainstorming about ways to create a Flits Dxp server, something similar to what we had created for Truus. We are pursuing a few ideas and maybe something useful will eventually develop from it.

-- Ed
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Internet engine matches

Post by TAILLE »

Hi Ed,

Very interesting result. It does indeed seem that Damage has been greatly improved.
The question now is the following: how many games were won due to the absence of an 8-piece database in Damage? In other words, how many losing moves were played by Damage in the endgame where Kingsrow was able to calculate the exact result?

Although I am not yet ready to program a DXP interface for Damy, I would find it very interesting to have such an interface for Flits in the future, because I consider Flits a very strong program.

For your information, while I am trying to stabilise my multithread version, I am working on my endgame db generator in order to build a 7-piece db that takes the draw+ results into account. Without any compression this complete 7p db will take about 410Gb.
I see three advantages in having this WLDPlus 7p db:
1) draw+ is taken into account at the top level, for the world championship
2) when playing against a program with only the 6p db, it seems far more interesting to play a move guaranteeing a draw+ rather than only a draw: the advantage is better and the probability that the opponent (the 6p db program) makes a mistake must be higher
3) it is a fun job for me!
Gérard

Re: Internet engine matches

Post by Ed Gilbert »

Hi Gerard,
TAILLE wrote: The question now is the following: how many games were won due to the absence of an 8-piece database in Damage? In other words, how many losing moves were played by Damage in the endgame where Kingsrow was able to calculate the exact result?
That is difficult to know, as I don't have any knowledge of what damage was showing for its search scores. My experience with matches against truus is that the larger dbs make very little difference in match results. They are much more useful for analyzing games, determining if a difficult ending is a win or draw, determining exactly which game move is the losing move, etc. You can see an example of using different dbs in these results reported in another thread on this forum:
Match 1: kingsrow/9pc vs. truus: 36 wins, 1 losses, 121 draws, 0 unknowns
Match 2: kingsrow/6pc vs. truus: 33 wins, 1 losses, 122 draws, 2 unknowns

TAILLE wrote: For your information, while I am trying to stabilise my multithread version, I am working on my endgame db generator in order to build a 7-piece db that takes the draw+ results into account. Without any compression this complete 7p db will take about 410Gb.
There is no doubt that this gives you better information, but there is a cost. You have to store one of 5 possible values for each position instead of 3, so the dbs will be larger. In my compression scheme they would be about 40% larger. This means that you can store fewer positions in db cache, so that is the downside of having this better information. I don't know which is better. What I did instead is to use a heuristic to calculate a db+ or db- for each position that returns db draw from the endgame db. This of course is not as good as building it into the db, but it doesn't affect the size of db cache or have a noticeable effect on search speed.
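The size cost of going from 3 to 5 values per position can be seen even without compression. The sketch below packs values as base-3 or base-5 digits into a single byte: five 3-valued results fit in a byte (3^5 = 243), but only three 5-valued results do (5^3 = 125). This is a plain fixed-width packing written for illustration, with hypothetical `pack` and `unpack` helpers. It is not the kingsrow format, and it does not model the compression scheme behind the 40% figure above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Pack N values, each in [0, Base), into one byte as base-Base digits.
// v[0] becomes the least significant digit.
template <unsigned Base, std::size_t N>
std::uint8_t pack(const std::array<std::uint8_t, N>& v) {
    unsigned code = 0;
    for (std::size_t i = N; i-- > 0;) {
        assert(v[i] < Base);
        code = code * Base + v[i];
    }
    assert(code < 256);  // Base^N must fit in one byte
    return static_cast<std::uint8_t>(code);
}

// Inverse of pack: recover the N base-Base digits from one byte.
template <unsigned Base, std::size_t N>
std::array<std::uint8_t, N> unpack(std::uint8_t code) {
    std::array<std::uint8_t, N> v{};
    unsigned rest = code;
    for (std::size_t i = 0; i < N; ++i) {
        v[i] = static_cast<std::uint8_t>(rest % Base);
        rest /= Base;
    }
    return v;
}
```

In this naive layout a win/loss/draw db needs 8/5 = 1.6 bits per position, while a five-valued db needs 8/3, about 2.67 bits, so the same cache holds fewer positions. Real databases are compressed, so the actual ratio depends on the scheme used.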

-- Ed
BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Internet engine matches

Post by BertTuyt »

Although I don't want to work on Damage this year, I nevertheless implemented some small changes in the MoveGenerator (based on previous Perft optimizations which I had not yet included in the main program) and in Search.

As the Horizon engine is based on the Damage search and MoveGenerator, these "improvements" will also benefit that program.
To verify that these code modifications introduced no bugs, I ran a 158-game match against Kingsrow (not via the Internet this time :D ).
Settings: 10 Minutes/Game, both Damage and Kingsrow played with 1 core only, and without pondering.

The good news: during the match, Damage did not crash at all, which is a sign that the code is most likely without major flaws.
The bad news (but not unexpected :D ): Kingsrow is still better (it takes more than a few small changes to beat Ed)...

Herewith the stats, from the Kingsrow perspective.

Win: 16, Loss: 3, Draw: 139.

Keep you all posted,

Bert
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04

Re: Internet engine matches

Post by Rein Halbersma »

BertTuyt wrote: Although I don't want to work on Damage this year, I nevertheless implemented some small changes in the MoveGenerator (based on previous Perft optimizations which I had not yet included in the main program) and in Search.

As the Horizon engine is based on the Damage search and MoveGenerator, these "improvements" will also benefit that program.
Can you share some of your insights with us?
BertTuyt wrote: To verify that these code modifications introduced no bugs, I ran a 158-game match against Kingsrow (not via the Internet this time :D ).
Settings: 10 Minutes/Game, both Damage and Kingsrow played with 1 core only, and without pondering.

The good news: during the match, Damage did not crash at all, which is a sign that the code is most likely without major flaws.
The bad news (but not unexpected :D ): Kingsrow is still better (it takes more than a few small changes to beat Ed)...

Herewith the stats, from the Kingsrow perspective.

Win: 16, Loss: 3, Draw: 139.
Very nice score! I guess this makes Damage the 2nd strongest draughts entity on the planet right now.

Rein
Krzychumag
Posts: 145
Joined: Tue Sep 01, 2009 17:31
Real name: Krzysztof Grzelak

Re: Internet engine matches

Post by Krzychumag »

And which version of the Kingsrow program did Bert play against?

Re: Internet engine matches

Post by BertTuyt »

I played with Damage version 11.1 :D

Some of the insights (and not rocket science).

On my i7 940 the 64-bit KingsrowEngine has a speed (at least during the opening and middle game, so without DB reads) of around 3500-4000 kN/sec.
Damage is around 2500-3000 kN/sec, so increasing the Nodes/sec is one area for improvement.

Maybe I use a heavier evaluation, but in a late middle-game position (with partial zugzwang and/or breakthrough) it is better to search deeper, so speed is key here.
Having said this, I should also switch to a lightweight evaluation when the number of pieces drops below a specific threshold.
As a result the search values will also differ less (all these different evaluation criteria introduce small score differences), which will improve the alpha-beta search.

During all the Perft tests I optimized some routines, making them completely bitboard-based.
The older versions (movegen) used a mix of bitboards and integers holding the specific board position. See the code below as an example.

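Code:

	while ( bbKing ) {

		// Index of the lowest set bit, i.e. the square of the next king
		// (MSVC intrinsic, declared in <intrin.h>).
		_BitScanForward64(&iPosition, bbKing) ;

		bbPosition0 = (BITBOARD)1<<iPosition ;	// the same square as a one-bit bitboard
		bbKing		^= bbPosition0 ;		// remove this king from the set still to be processed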

In the code (from the KingMove routine) I scan through the king bitboard bbKing, but use a mix of integer position and bitboard.
In the more recent routines, I want to base the routines completely on bitboards.

The integer position (which is needed, for example, in the hash key update) is only determined during the DoMove.
As often only a few moves are examined due to a beta cut-off, the "cost" of determining the exact position is then saved, resulting in a small speed gain.
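A minimal sketch of the bitboard-only idea, under stated assumptions: `Bitboard`, `lowest_bit`, `square_index` and `for_each_king` are illustrative names, not the Damage routines. The loop isolates each king as a one-bit bitboard without computing its square number, and the index is derived only when a caller asks for it. The index function here is a slow portable loop; a real engine would presumably use an intrinsic such as `_BitScanForward64`.

```cpp
#include <cassert>
#include <cstdint>

using Bitboard = std::uint64_t;

// Isolate the lowest set bit as a one-bit bitboard; no square index is computed.
inline Bitboard lowest_bit(Bitboard bb) {
    return bb & (~bb + 1);
}

// Square index of a one-bit bitboard, computed lazily (for example only when a
// move is actually made and the hash key has to be updated).
inline int square_index(Bitboard one_bit) {
    assert(one_bit != 0 && (one_bit & (one_bit - 1)) == 0);  // exactly one bit set
    int index = 0;
    while ((one_bit >>= 1) != 0) {
        ++index;
    }
    return index;
}

// Visit every king on the board as a one-bit bitboard.
template <typename Visit>
void for_each_king(Bitboard kings, Visit visit) {
    while (kings != 0) {
        const Bitboard from = lowest_bit(kings);
        kings ^= from;  // remove this king from the set still to be processed
        visit(from);
    }
}
```

If a beta cut-off ends the move loop early, `square_index` is simply never called for the moves that were generated but not played, which is where the saving described above would come from.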

Not only is the Nodes/sec speed of Kingsrow high, the program can also search extremely deep (especially near the endgame).
For this reason I made the pruning mechanism somewhat more aggressive (reducing the search depth of uninteresting move sequences).

I think it will take more than that, so in a second phase I also need to improve the breakthrough mechanism and add somewhat more extensions.

Last but not least, during the endgame the Damage Nodes/sec drops to around 100 kNodes/sec (due to frequent cache misses in the DB cache).
I still think I have a bug, as when the cache is not full (and I need to read all 4kByte DB-blocks) the speed reduction is somewhat less.
Next to that, in case of a DB cache miss it may be safe (in specific circumstances) to rely on a heuristic DB value.

So based on all this, I hope I can limit the KingsRow wins to below 10 in 2012 :)

Keep you all posted

Bert

Re: Internet engine matches

Post by Krzychumag »

Which version of the Kingsrow program did Bert play against?

Re: Internet engine matches

Post by BertTuyt »

Kingsrow version 1.51 (64Bit)

Bert

Re: Internet engine matches

Post by Ed Gilbert »

Hi Bert,

I was wondering about some of the program settings in the match (for both kingsrow and damage).
- egdb pieces (I know only 6 for kingsrow, but damage?)
- hashtable size
- egdb driver memory used
- opening book setting
BertTuyt wrote: Last but not least, during the endgame the Damage Nodes/sec drops to around 100 kNodes/sec (due to frequent cache misses in the DB cache).
I still think I have a bug, as when the cache is not full (and I need to read all 4kByte DB-blocks) the speed reduction is somewhat less.
Next to that, in case of a DB cache miss it may be safe (in specific circumstances) to rely on a heuristic DB value.
I am a little surprised that you experienced this much slowdown in endings, unless you were using something larger than the 6pc db. The 6pc db is only ~1 GB and should fit completely in your cache memory, unless you were using a very small cache memory setting. Even if you were only using a few hundred MB for cache, you should have very few cache misses with a 6pc db.

-- Ed

Re: Internet engine matches

Post by BertTuyt »

Ed, herewith some information (had to check, as I used standard settings for both):

Kingsrow:
OpeningBook = Best Moves
HashTable Size = 128 MB
DB cache Size = 10000 MB, 6P
Pondering = Off
Time = 10 Min / 65 Moves

Damage:
OpeningBook = None
HashTable Size = 128 MB ( = 8M entries, each entry 16 Byte)
DB cache Size = 4GB ( 1M Entries 4KByte), 7P
Pondering = Off
Time = 10 seconds/Move

You are right I used a 7P DB (which does not fit in my cache :D ).
But as the ratio of cache misses to DB position reads approaches 50% when the DB cache is full, I suspect that the FIFO mechanism does not work well (yet).
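For readers who want to experiment with this kind of measurement, here is a minimal sketch of a FIFO block cache with hit and miss counters. It is not the Damage DB handler: the class `FifoBlockCache`, the use of block ids as keys, and the counting of one access per lookup are all assumptions made for the example, and no actual 4 KByte block data is stored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <unordered_set>

// FIFO cache of database block ids: when full, the block loaded longest ago is
// evicted, regardless of how recently it was used.
class FifoBlockCache {
public:
    explicit FifoBlockCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Returns true on a cache hit. On a miss the block is "loaded" and, if the
    // cache is full, the oldest resident block is evicted first.
    bool access(std::uint64_t block_id) {
        if (resident_.count(block_id) != 0) {
            ++hits_;
            return true;
        }
        ++misses_;
        if (order_.size() == capacity_) {
            resident_.erase(order_.front());
            order_.pop_front();
        }
        order_.push_back(block_id);
        resident_.insert(block_id);
        return false;
    }

    // Fraction of all accesses so far that were misses.
    double miss_ratio() const {
        const std::uint64_t total = hits_ + misses_;
        return total == 0 ? 0.0 : static_cast<double>(misses_) / static_cast<double>(total);
    }

    std::size_t size() const { return order_.size(); }

private:
    std::size_t capacity_;
    std::deque<std::uint64_t> order_;             // load order, oldest at the front
    std::unordered_set<std::uint64_t> resident_;  // blocks currently in the cache
    std::uint64_t hits_ = 0;
    std::uint64_t misses_ = 0;
};
```

Replaying a recorded sequence of block ids through such a model and comparing `miss_ratio()` with the engine's own counters is one way to tell whether a high miss ratio comes from the replacement policy itself or from a bug in the bookkeeping.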

In a previous post you pointed out that DBs of 6P or more do not have a huge impact, so I used the 7P as I still wanted to check whether some DB problems (crash, extreme slowdown) still occur...

Keep you all posted,

Bert

Re: Internet engine matches

Post by BertTuyt »

I found the bug in the Endgame DB-handler.
So at least there are no crashes and no dramatic slowdown in the end phase.
To test whether everything works I played another match (with similar settings) against Kingsrow.
This time the result was somewhat worse (so apparently the bug did not have a major impact on playing strength).
But I guess this is normal statistical variation (Ed, I hope you can confirm).

Match Results, from the perspective of KingsRow.

Win: 21, Loss: 5, Draw: 132

I assume this is as close as any other program has come so far (but Ed, you most likely know best).
I'm curious whether Jan-Jaap has recent results against Kingsrow; if I remember correctly, his match results were slightly worse.

A pity that we don't have Damy info.
I believe that Damy is very good, so I'm not sure whether Damage comes second, but third is not that bad.
And at least my idea box is far from empty, so I hope I can present better match results in 2012.

Keep you posted,

Bert

Re: Internet engine matches

Post by BertTuyt »

Forgot to ask.

Based on the recent match outcomes, what is the approximate rating difference between Damage and KingsRow?

Bert

Re: Internet engine matches

Post by Ed Gilbert »

Bert,

Here is the output from bayeselo.

ResultSet>addplayer kingsrow
ResultSet>addplayer damage
ResultSet>addwld 0 1 21 5 132
ResultSet>addwld 0 1 16 3 139
ResultSet>elo
ResultSet-EloRating>advantage 0
0
ResultSet-EloRating>mm
00:00:00,00
ResultSet-EloRating>exactdist
00:00:00,00
ResultSet-EloRating>ratings
Rank Name      Elo    +    - games score oppo. draws
   1 kingsrow   11   14   14   316   55%   -11   86%
   2 damage    -11   14   14   316   45%    11   86%
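As a cross-check that needs no rating tool, a match score can be converted to a rating difference with the standard logistic Elo formula, 400 * log10(p / (1 - p)). The sketch below is only that back-of-the-envelope formula with hypothetical function names. bayeselo fits a fuller statistical model that also accounts for draws, so its ratings need not coincide with this figure.

```cpp
#include <cassert>
#include <cmath>

// Score fraction from a win/loss/draw record (win = 1, draw = 0.5, loss = 0).
double match_score(int wins, int losses, int draws) {
    const int games = wins + losses + draws;
    assert(games > 0);
    return (wins + 0.5 * draws) / games;
}

// Rating difference implied by a score fraction under the logistic Elo model.
double elo_difference(double score) {
    assert(score > 0.0 && score < 1.0);
    return 400.0 * std::log10(score / (1.0 - score));
}
```

Pooling the two matches entered above (37 wins, 8 losses, 271 draws from the kingsrow side) gives a score of about 54.6%, which this formula turns into roughly 32 Elo. That is the same order of magnitude as the bayeselo ratings, and the error margins in the table are of similar size to the difference itself.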