Endgame database benchmarks

Discussion about the development of draughts in the age of computers and the Internet.
BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Endgame database benchmarks

Post by BertTuyt »

I bought an 80GB Intel SSD X25-M G2 this week :)
Large enough to fit my 7P DB.
Hope I have time during this or next weekend to install it in my PC.
Will run Ed's benchmark to see the difference with my normal HD.
I'll keep you posted.....

Bert
Ed Gilbert
Posts: 860
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Hola

Post by Ed Gilbert »

This forum has been awfully quiet lately! Where did everyone go? Watching world cup I imagine :-)

Bert, did you get to try your SSD drive? I'm interested to hear about it.

Gerard, what are you up to? Probably riding your bike in the Alps and getting ready to follow the Tour. Have you made any progress solving the parallel search problem in Damy?

Harm, how is that new 64-bit bitboard engine working?

Rein I hear has a search working in his new engine. I am looking forward to seeing it.

I have recovered from my broken foot and have been doing a lot of bicycling. Also busy giving the kingsrow user interface a face lift.

-- Ed
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Hola

Post by Rein Halbersma »

Ed Gilbert wrote:This forum has been awfully quiet lately! Where did everyone go? Watching world cup I imagine :-)

Bert, did you get to try your SSD drive? I'm interested to hear about it.

Gerard, what are you up to? Probably riding your bike in the Alps and getting ready to follow the Tour. Have you made any progress solving the parallel search problem in Damy?

Harm, how is that new 64-bit bitboard engine working?

Rein I hear has a search working in his new engine. I am looking forward to seeing it.

I have recovered from my broken foot and have been doing a lot of bicycling. Also busy giving the kingsrow user interface a face lift.

-- Ed
Hi Ed,

My search function is slowly coming together. I've been browsing the Stockfish source code quite a bit, as well as the BB+ report on the Rybka/Ippolit similarities. The Stockfish code is hard to digest though, with lots of 200+ line functions that do almost but not quite the same thing. I'm working on abstracting it into small-scale function-template building blocks that can more easily be combined, à la my move generator. I've also been testing my hash table quite a bit; more on that later this week...

Rein
Ed Gilbert
Posts: 860
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Re: Endgame database benchmarks

Post by Ed Gilbert »

My search function is slowly coming together. I've been browsing the Stockfish source code quite a bit, as well as the BB+ report on the Rybka/Ippolit similarities. The Stockfish code is hard to digest though, with lots of 200+ line functions that do almost but not quite the same thing. I'm working on abstracting it into small-scale function-template building blocks that can more easily be combined, à la my move generator. I've also been testing my hash table quite a bit; more on that later this week...
I found the Stockfish code helpful when I was working on the parallel search, but otherwise I did not learn much from it. Of course it is PVS where my search is MTD, so there are those differences. But chess has so much more going on. They have a lot more information that they can use for move ordering, and that spills over into the ways they do extensions and truncations. Most of it seemed not exactly applicable to draughts.

-- Ed
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Endgame database benchmarks

Post by Rein Halbersma »

Ed Gilbert wrote:
My search function is slowly coming together. I've been browsing the Stockfish source code quite a bit, as well as the BB+ report on the Rybka/Ippolit similarities. The Stockfish code is hard to digest though, with lots of 200+ line functions that do almost but not quite the same thing. I'm working on abstracting it into small-scale function-template building blocks that can more easily be combined, à la my move generator. I've also been testing my hash table quite a bit; more on that later this week...
I found the Stockfish code helpful when I was working on the parallel search, but otherwise I did not learn much from it. Of course it is PVS where my search is MTD, so there are those differences. But chess has so much more going on. They have a lot more information that they can use for move ordering, and that spills over into the ways they do extensions and truncations. Most of it seemed not exactly applicable to draughts.

-- Ed
Interesting that you think that chess has more information to base truncations on.

Of course, draughts does not have pawn structure, king safety, discovered checks or pinned pieces, but for most other concepts there are close draughts analogues. E.g. the pawn structure in chess models a long-term eval term that rarely changes during searches of moderate depth. In draughts, this could be captured by the structure of black and white pieces on the middle 2 rows. You could mimic this with a dedicated "pawn hash"-style hash table. King safety has no analogue of course, but breakthrough extensions should work almost the same.

There are also draughts specific features that have no chess analogue. E.g. the amount of actually and potentially controlled terrain (see the long discussion on the French forum by Gerard) in relation to the advancement of the pieces is very important in draughts. Also accurately estimating the number of defenders for outposts on the 6th row requires special care. This is somewhat similar to the SEE capture algorithm in chess.

For the search algorithm, I can see a role in draughts for almost all steps that the chess guys use. I think that even null moves can be made to work for draughts (the double null move trick by Vincent Diepeveen seems to take care of zugzwangs automatically).

One trick that is used in Stockfish is very interesting. They use LMR at fail-low nodes, but if after a reduced search the best opponent move is refuted by a move *connected* to the reduced move, then the reduction is aborted and the late move is re-searched to full depth.

[diagram]
Here's an example: black to move. Let's say we are searching at d=5 and that the PV so far is 30-35 39-34 8-13 33-29 13-19. The next white responses to 30-35 are reduced by LMR so the line 30-35 42-37 35-40 is searched at d=3 with an eval score that sees an unopposed black breakthrough as a black win.

Now we start to search at d=7. The PV becomes 30-35 39-34 8-13 33-29 13-19 32-28 26-31. The next white response to 30-35 is now searched at d=5: 42-37 35-40 37-31! 26x28 33x2 with a white win. At this point the LMR will notice that 35-40 fails due to 37-31, which is connected to the previously reduced move 42-37. Therefore 42-37 will be re-searched to d=7 to resolve the threat 37-31. Apparently, in Stockfish this is better than simply continuing the search at d=5 for responses to 42-37 better than 35-40.

Connection means in this case [from - X] (opponent move) [X - dest]. Other connection patterns are e.g. [X - dest] (opponent move) [from - X] (here the first move clears the space for the second move). See diagram below
[diagram]
Here 30-35 31-27 enables the threat 37-31! The variations are the same here.
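My reading of this connection test in code (a sketch of mine, using draughts square numbers; the function name and move encoding are hypothetical):

```python
# A "connected move" test for the LMR re-search trick. A move is a
# (from_square, to_square) pair in draughts numbering. Two patterns:
#   [from-X] (reply) [X-dest]: the refuting piece is the one that just
#       moved on from the reduced move's destination square;
#   [X-dest] (reply) [from-X]: the reduced move vacated the square that
#       the refutation now occupies.
def connected(reduced, refutation):
    r_from, r_to = reduced
    f_from, f_to = refutation
    return r_to == f_from or r_from == f_to

# The two diagram examples: 42-37 enables 37-31 (same piece moves on),
# and 31-27 enables 37-31 (square 31 is vacated).
```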

Rein
Ed Gilbert
Posts: 860
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Re: Endgame database benchmarks

Post by Ed Gilbert »

Hi Rein,
Interesting that you think that chess has more information to base truncations on.

Of course, draughts does not have pawn structure, king safety, discovered checks or pinned pieces, but for most other concepts there are close draughts analogues. E.g. the pawn structure in chess models a long-term eval term that rarely changes during searches of moderate depth. In draughts, this could be captured by the structure of black and white pieces on the middle 2 rows. You could mimic this with a dedicated "pawn hash"-style hash table. King safety has no analogue of course, but breakthrough extensions should work almost the same.

There are also draughts specific features that have no chess analogue. E.g. the amount of actually and potentially controlled terrain (see the long discussion on the French forum by Gerard) in relation to the advancement of the pieces is very important in draughts. Also accurately estimating the number of defenders for outposts on the 6th row requires special care. This is somewhat similar to the SEE capture algorithm in chess.
Actually when I said Stockfish I meant Glaurung, which is the predecessor of Stockfish and the source code that I found helpful when I was implementing a parallel search. I have looked at Stockfish also, but I am more familiar with Glaurung.

From reading the discussions among the chess programmers, one of the biggest elo gains in the last 15 or so years came from LMR. For those that are not familiar, LMR is short for late move reductions. The idea is that in an alphabeta search you of course order the moves by searching the expected strongest ones first, and if you still do not have a beta cutoff after the first few moves then you can reduce the search depth for the remaining moves, thus saving search time and allowing a deeper search in the same time for the better moves that were ordered near the top of the list.

In chess you have some additional info to use for ordering moves that does not exist in draughts -- checks, discovered checks, passed pawns, and the exchange evaluator which considers the worth of the exchanged pieces. In common we have hashtable moves, killer moves, and in some cases the history table moves, although I see that Crafty no longer uses that, which is also interesting as this is one of the move ordering terms in all versions of kingsrow.

Some of the other terms that you mentioned for draughts are terms that I use in position evaluation but have not considered using for move ordering. Maybe I should! I think at one time I tried using the static evaluation in the 8x8 checkers kingsrow for move ordering. It was quite a while ago and I cannot remember the results of the experiment, but since I'm not using that now I have to assume it was not a positive result. The structure of the center pieces in the middle rows, breakthroughs, and the evaluation of outposts are certainly important eval terms, as are tempo, and perhaps controlled terrain, or at least reserved moves in a closed game, but I have not been using these for move ordering. My point was that I tried LMR for draughts and it did not work nearly as well as the scheme that I was already using for reductions.
It's always difficult to know for sure why something 'works' or doesn't, but my suspicion about LMR was that it works better in chess due to the better move ordering info in that game.
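The LMR scheme Ed describes can be sketched as a negamax over an abstract game tree (a toy of mine, not kingsrow's or any engine's actual code; the reduction of one ply and the threshold of three full-depth moves are illustrative, not tuned values):

```python
# LMR sketch over a toy game tree: a position is an int leaf score
# (from the side to move) or a list of child positions, ordered best-first.
FULL_MOVES = 3   # moves searched at full depth before reducing
REDUCTION = 1    # depth reduction for late moves
INF = 10**9

def lmr_search(pos, depth, alpha=-INF, beta=INF):
    if isinstance(pos, int) or depth == 0:
        return pos if isinstance(pos, int) else 0  # stand-in static eval
    best = -INF
    for i, child in enumerate(pos):
        if i < FULL_MOVES:
            score = -lmr_search(child, depth - 1, -beta, -alpha)
        else:
            # late move: search at reduced depth first; if it does not
            # fail low, re-search it at full depth
            reduced = max(depth - 1 - REDUCTION, 0)
            score = -lmr_search(child, reduced, -beta, -alpha)
            if score > alpha:
                score = -lmr_search(child, depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # beta cutoff: remaining moves are skipped entirely
    return best
```

With good ordering, the reduced searches confirm the fail-low cheaply; the re-search guards against the case where a late move turns out better than expected.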

I never tried a null move in draughts and just assumed that it could not work because of the nature of the game. Do you have a reference for the double null move trick by Vincent D.?

-- Ed
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Endgame database benchmarks

Post by Rein Halbersma »

Ed Gilbert wrote:Hi Rein,

Actually when I said Stockfish I meant Glaurung, which is the predecessor of Stockfish and the source code that I found helpful when I was implementing a parallel search. I have looked at Stockfish also, but I am more familiar with Glaurung.

From reading the discussions among the chess programmers, one of the biggest elo gains in the last 15 or so years came from LMR. For those that are not familiar, LMR is short for late move reductions. The idea is that in an alphabeta search you of course order the moves by searching the expected strongest ones first, and if you still do not have a beta cutoff after the first few moves then you can reduce the search depth for the remaining moves, thus saving search time and allowing a deeper search in the same time for the better moves that were ordered near the top of the list.

In chess you have some additional info to use for ordering moves that does not exist in draughts -- checks, discovered checks, passed pawns, and the exchange evaluator which considers the worth of the exchanged pieces. In common we have hashtable moves, killer moves, and in some cases the history table moves, although I see that Crafty no longer uses that, which is also interesting as this is one of the move ordering terms in all versions of kingsrow.

Some of the other terms that you mentioned for draughts are terms that I use in position evaluation but have not considered using for move ordering. Maybe I should! I think at one time I tried using the static evaluation in the 8x8 checkers kingsrow for move ordering. It was quite a while ago and I cannot remember the results of the experiment, but since I'm not using that now I have to assume it was not a positive result. The structure of the center pieces in the middle rows, breakthroughs, and the evaluation of outposts are certainly important eval terms, as are tempo, and perhaps controlled terrain, or at least reserved moves in a closed game, but I have not been using these for move ordering. My point was that I tried LMR for draughts and it did not work nearly as well as the scheme that I was already using for reductions.
It's always difficult to know for sure why something 'works' or doesn't, but my suspicion about LMR was that it works better in chess due to the better move ordering info in that game.

I never tried a null move in draughts and just assumed that it could not work because of the nature of the game. Do you have a reference for the double null move trick by Vincent D.?

-- Ed
http://www.talkchess.com/forum/viewtopi ... 03&t=29176
http://talkchess.com/forum/viewtopic.ph ... e+checkers

At every node, after searching the TT move but before any other move, you first try a null move (i.e. simply passing the right to move) at a reduced depth (R = 3 typically, sometimes R = 2 at low depths). If the reduced search does not fail high, then you keep searching all other moves. Null moves in chess cannot be made when in check. In draughts, you cannot make a null move when you have to capture.

Null moves are made recursively in the entire search. Some engines only try it if the static (or lazy) eval is near beta. Others try it everywhere. The only exception is that you can at most have 2 null moves in a row. This will take care of immediate zugzwangs. E.g. if all pieces are in direct opposition, a variation like [null null something] will immediately lose material and fail low. This means that the second null move fails high and the first null move fails low and an incorrect pruning is avoided. Note that the minimal search depth to detect an immediate zugzwang with a double null move is 2 R + 3. Distant zugzwangs are more difficult to detect of course, but the same holds for a regular search (albeit to a lesser degree).

Captures are a bit subtle. First, you cannot null move when you yourself have to capture. But what if your opponent threatens to capture? If you null move, he captures 1 piece and you recapture (say) 2 or more pieces, so it looks like you win material. But how do you know for sure that you have a passing move? So one should allow an exception: you cannot make a null move when you have to capture, unless it is a response to a previous null move (to let the opponent first prove that he can let you capture). An alternative would be to forbid making a null move when the opponent threatens a capture, but to first search a normal move at depth = d - 2R - 2 (so still making the double null move reduction, but really doing a null move plus a normal capture move) to see if you can safely let the opponent capture.

Of course, there's no guarantee that it will work. Perhaps zugzwangs are too prevalent or too deep to detect efficiently in draughts. I'll try it out though!
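The double null move recipe above could be sketched over an abstract game like this (a toy of mine, not any engine's code; the position encoding, the stand-in eval, and the leaf values in the comments are all made up for illustration):

```python
# Double null move sketch. A position is an int leaf score (from the side
# to move), or a dict: "moves" = children after real moves, "null" = the
# position after passing, or None when a capture is forced.
INF = 10**9
R = 3  # null-move depth reduction

def null_search(pos, depth, alpha=-INF, beta=INF, nulls=0):
    if isinstance(pos, int):
        return pos
    if depth <= 0:
        return 0  # stand-in static eval for internal nodes
    # at most 2 consecutive null moves; none at all when we must capture
    if nulls < 2 and pos["null"] is not None:
        score = -null_search(pos["null"], depth - 1 - R,
                             -beta, -beta + 1, nulls + 1)
        if score >= beta:
            return score  # passing already refutes the opponent: prune
    best = -INF
    for child in pos["moves"]:
        score = -null_search(child, depth - 1, -beta, -alpha, 0)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best
```

In a zugzwang position (every real move loses), the first null move's fail-high is itself refuted by the opponent's answering null move, so the incorrect pruning is avoided, exactly as described above.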

Rein
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Hola

Post by TAILLE »

Ed Gilbert wrote:This forum has been awfully quiet lately! Where did everyone go? Watching world cup I imagine :-)

Gerard, what are you up to? Probably riding your bike in the Alps and getting ready to follow the Tour. Have you made any progress solving the parallel search problem in Damy?

-- Ed
Hi Ed.
Yes Ed, I am still working hard on my parallel search algorithm and also on a parallel generation process, in order to be able to begin the 8 piece database generation. Now I am able to generate the 6 piece database in about 4 hours with two quad processors.
At the same time I am completely changing my endgame database format, which was not optimised to work with the MTD(f)-based algorithm I use in my engine. The idea is the following: when you use an MTD(f) algorithm and you reach a position that is in the endgame database, you do not need the exact value (loss, draw, win) of the position. If for example the test value is greater than 0, then you only need to know if the result is a win. If not, it does not matter if the result is a draw or a loss.
My new 6 piece endgame database is now made of two sub-databases, in order to answer the two following questions:
1) Is the concerned position a winning position?
2) Is the concerned position a losing position?
What are the consequences? I have an extension of the db on the disk (about 20%) but, when reading the disk, I avoid storing in memory a great amount of useless information. Because it does no harm to have this 20% extension on the disk, I suspect it is a good improvement.
Another advantage is the following: it must be possible to store in memory all the losing positions of the 7 piece endgame db and avoid any disk access during a real game.
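Gérard's two-sub-database idea could look like this in outline (class and method names are mine; the real databases are compressed bitsets on disk, and only the sub-database matching the side of the test value needs to be resident):

```python
# Two boolean sub-databases instead of one WIN/DRAW/LOSS table. An MTD(f)
# null-window probe only needs a bound, so each probe consults just one
# of the two; "DRAW" then really means "not a win" or "not a loss".
WIN, DRAW, LOSS = 1, 0, -1

class EndgameDB:
    def __init__(self, values):           # values: WIN/DRAW/LOSS per index
        self.is_win  = [v == WIN  for v in values]
        self.is_loss = [v == LOSS for v in values]

    def probe(self, index, test_value):
        if test_value > DRAW:             # testing "is it a win?"
            return WIN if self.is_win[index] else DRAW
        else:                             # testing "is it a loss?"
            return LOSS if self.is_loss[index] else DRAW
```

Storing a draw and a loss identically on the "win?" side is exactly what saves the memory: the information an MTD(f) search never asks for is never loaded.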
Gérard
MichelG
Posts: 244
Joined: Sun Dec 28, 2003 20:24
Contact:

Re: Hola

Post by MichelG »

TAILLE wrote: Hi Ed.
Yes Ed, I am still working hard on my parallel search algorithm and also on a parallel generation process, in order to be able to begin the 8 piece database generation. Now I am able to generate the 6 piece database in about 4 hours with two quad processors.
Is that 4 hours for the complete (3v3, 4v2, 5v1) databases? That's quite impressive!

I am also working on creating the 8 piece databases. The code runs in parallel, but it needs some work to run faster. Did you make algorithmic improvements, or did you just optimize your existing code?



Michel
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Hola

Post by TAILLE »

MichelG wrote:
TAILLE wrote: Hi Ed.
Yes Ed, I am still working hard on my parallel search algorithm and also on a parallel generation process, in order to be able to begin the 8 piece database generation. Now I am able to generate the 6 piece database in about 4 hours with two quad processors.
Is that 4 hours for the complete (3v3, 4v2, 5v1) databases? That's quite impressive!

I am also working on creating the 8 piece databases. The code runs in parallel, but it needs some work to run faster. Did you make algorithmic improvements, or did you just optimize your existing code?



Michel
Yes Michel, I need 4 hours for generating the complete (3v3, 4v2, 5v1) db. This includes the compression phase but not the verification process. Verification is another problem; I need far more time for verification than for building the db. For the 8 piece db I intend to limit the verification process a bit more.

What is your approach for this verification phase?
Gérard
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Hola

Post by Rein Halbersma »

TAILLE wrote:
MichelG wrote:
TAILLE wrote: Hi Ed.
Yes Ed, I am still working hard on my parallel search algorithm and also on a parallel generation process, in order to be able to begin the 8 piece database generation. Now I am able to generate the 6 piece database in about 4 hours with two quad processors.
Is that 4 hours for the complete (3v3, 4v2, 5v1) databases? That's quite impressive!

I am also working on creating the 8 piece databases. The code runs in parallel, but it needs some work to run faster. Did you make algorithmic improvements, or did you just optimize your existing code?



Michel
Yes Michel, I need 4 hours for generating the complete (3v3, 4v2, 5v1) db. This includes the compression phase but not the verification process. Verification is another problem; I need far more time for verification than for building the db. For the 8 piece db I intend to limit the verification process a bit more.

What is your approach for this verification phase?
Why don't you build the databases twice and check whether they give the same results? That and a comparison with the results already published by Ed and others should make sure your results are correct.
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Hola

Post by TAILLE »

Hi Rein,
Rein Halbersma wrote: Why don't you build the databases twice and check whether they give the same results? That and a comparison with the results already published by Ed and others should make sure your results are correct.
That's not so simple. When I rebuilt my 2, 3, 4, 5 and 6 piece dbs for the first time with my new multithreaded algorithm, I was not able to discover the inconsistencies in my db: after each new db was built, the verification process was OK and all the results seemed identical to my "old" db and to the results already published.
Fortunately I ran a complete verification of the db again, and I discovered a bug that disturbed some indexes of the 4 piece db while building the 6 piece db. The problem is now fixed, and from that experience I now intend to build the complete 7 piece db before running a complete verification of the 2-7 dbs.
For the 8 piece db I will take the same approach: I intend to generate the complete db first, without any verification other than comparing the number of losses, draws and wins with Ed, and then I will run a (probably partial) verification process.
Gérard
Ed Gilbert
Posts: 860
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Re: Endgame database benchmarks

Post by Ed Gilbert »

It is nice to hear that others are working on the 8pc database. I also heard from Gijsbert Wiesenekker recently that he is building the distance to win database for 7 pieces.

I would be interested to hear how others are doing the concurrency. I subdivided the 8pc db into hundreds of thousands of small pieces which at any instant in time typically allowed 100 or more subdivisions to be built in parallel. Schaeffer did not subdivide nearly as much but instead broke the build process up into smaller steps -- capture pass, conversion pass, non-conversion passes, compression, and verification, and used a work queue that could hand out one of these tasks to an available computer on the network. My build program is single threaded as once it gets the information about the next subdivision to build from the master of the cluster it can do all those steps unassisted, and then send the finished compressed result back to the master and move on to the next one. Are you running many instances of one build program as I did, or is it one program with many threads? How do you subdivide the work?
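Ed's master/worker scheme, reduced to a single-process toy (the real build hands subdivisions out to build programs across a cluster and ships compressed slices back over the network; `build` and the worker count here are stand-ins of mine):

```python
# Work-queue sketch: the master fills a queue with subdivision ids, each
# worker repeatedly takes one, builds it unassisted, and reports the
# finished result back before moving on to the next.
import queue
import threading

def build(subdivision_id):
    # stand-in for the capture/conversion/compression passes of one slice
    return subdivision_id * 2

def run_cluster(subdivisions, n_workers=4):
    work, results = queue.Queue(), {}
    lock = threading.Lock()
    for s in subdivisions:
        work.put(s)

    def worker():
        while True:
            try:
                s = work.get_nowait()
            except queue.Empty:
                return            # no subdivisions left: worker retires
            r = build(s)
            with lock:
                results[s] = r    # "send the finished result to the master"
    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Subdividing into many small independent pieces is what keeps every worker busy: at any instant, as many subdivisions are in flight as there are idle workers.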

-- Ed
MichelG
Posts: 244
Joined: Sun Dec 28, 2003 20:24
Contact:

Re: Endgame database benchmarks

Post by MichelG »

Dragon runs as a single, multithreaded application. I try to have it work on databases as big as possible that still fit in memory. Currently the databases are split according to the number of pieces, like 1210 for a database with 1 man, 2 kings and 1 black man.

For databases with 7 pieces, these are subdivided according to the rank of the most advanced man (splitting into at most 9*9 = 81 subslices). With 8 pieces I subdivide according to the position of the most advanced piece (45*45 slices).

At each iteration all cores work on the same database. I think that works quite well, because each of the passes through the database can be multithreaded quite efficiently :-) The only disadvantage is that you can't easily run it concurrently on several computers, but it is possible manually by generating 5 vs 3 in parallel with 4 vs 4 pieces.

I didn't parallelize the compression phase though as it still runs fast enough to be not much of a problem.

One problem that Dragon still has is that its algorithm is intended for computing distances to win instead of only win/draw/loss. I think that is slowing it down a bit, and I am not quite happy about the speed it runs at (a 5 piece database build takes 45 minutes on a Q6600, or about 30 minutes on an i860).
I also heard from Gijsbert Wiesenekker recently that he is building the distance to win database for 7 pieces.
I already did a (partial) calculation of these. A slight complication is that some of the positions take 256 ply to solve, which means you can't fit a position in a byte. Dragon uses 10 bits/position now to handle that.
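A 10-bits-per-position array could be packed like this (a sketch of mine; Dragon's actual layout is not described here, and the little-endian bit order is an arbitrary choice). Ten bits hold distances up to 1023 ply, versus 255 for a byte:

```python
# Pack/unpack fixed-width 10-bit values into a byte buffer.
BITS = 10
MASK = (1 << BITS) - 1

def pack(values):
    buf, acc, nbits = bytearray(), 0, 0
    for v in values:
        acc |= (v & MASK) << nbits   # append 10 bits to the accumulator
        nbits += BITS
        while nbits >= 8:            # flush whole bytes
            buf.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:                        # flush the final partial byte
        buf.append(acc & 0xFF)
    return bytes(buf)

def unpack(buf, count):
    out, acc, nbits, i = [], 0, 0, 0
    for _ in range(count):
        while nbits < BITS:          # refill until 10 bits are available
            acc |= buf[i] << nbits
            i += 1
            nbits += 8
        out.append(acc & MASK)
        acc >>= BITS
        nbits -= BITS
    return out
```

Five positions then take 7 bytes instead of 10, a 37.5% saving over 2 bytes per position.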

You say that verifying takes more time than the original computation. For me, I just check for consistency within the database, and that takes about half the time of the build. It isn't a true verification because it wouldn't find errors in, for example, the move generator, but it does find (virtually all) hardware errors. Other than that, running the program twice would be the only option to make absolutely sure there are no hardware errors.

Michel
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Endgame database benchmarks

Post by TAILLE »

MichelG wrote:Dragon runs as a single, multithreaded application. I try to have it work on databases as big as possible that still fit in memory. Currently the databases are split according to the number of pieces, like 1210 for a database with 1 man, 2 kings and 1 black man.

For databases with 7 pieces, these are subdivided according to the rank of the most advanced man (splitting into at most 9*9 = 81 subslices). With 8 pieces I subdivide according to the position of the most advanced piece (45*45 slices).

At each iteration all cores work on the same database. I think that works quite well, because each of the passes through the database can be multithreaded quite efficiently :-) The only disadvantage is that you can't easily run it concurrently on several computers, but it is possible manually by generating 5 vs 3 in parallel with 4 vs 4 pieces.

I didn't parallelize the compression phase though as it still runs fast enough to be not much of a problem.

One problem that Dragon still has is that its algorithm is intended for computing distances to win instead of only win/draw/loss. I think that is slowing it down a bit, and I am not quite happy about the speed it runs at (a 5 piece database build takes 45 minutes on a Q6600, or about 30 minutes on an i860).
I also heard from Gijsbert Wiesenekker recently that he is building the distance to win database for 7 pieces.
I already did a (partial) calculation of these. A slight complication is that some of the positions take 256 ply to solve, which means you can't fit a position in a byte. Dragon uses 10 bits/position now to handle that.

You say that verifying takes more time than the original computation. For me, I just check for consistency within the database, and that takes about half the time of the build. It isn't a true verification because it wouldn't find errors in, for example, the move generator, but it does find (virtually all) hardware errors. Other than that, running the program twice would be the only option to make absolutely sure there are no hardware errors.

Michel
Like Dragon, Damy runs as a single, multithreaded application, all cores working on the same database. The subdivisions for Damy are different. For a db without men, Damy creates 100 subdivisions based on the position indexes. For dbs with men, the subdivisions are based on the men configuration of one side.
Generation time with two quad-core i7s, Damy using 7 cores:
complete 5 piece db: 5'
complete 6 piece db: 4h
0025 db (bm, wm, bk, wk): 35' (without symmetry considerations)
0124 db: 2h 50'
Gérard