Endgame database benchmarks

Discussion about the development of draughts in the age of computers and the Internet.
BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Endgame database benchmarks

Post by BertTuyt »

I bought an 80GB Intel SSD X25-M G2 this week :)
Large enough to fit my 7P DB.
Hope I have time during this or next weekend to install it in my PC.
Will run Ed's benchmark to see the difference with my normal HD.
I'll keep you posted.....

Bert
Ed Gilbert
Posts: 860
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Hola

Post by Ed Gilbert »

This forum has been awfully quiet lately! Where did everyone go? Watching world cup I imagine :-)

Bert, did you get to try your SSD drive? I'm interested to hear about it.

Gerard, what are you up to? Probably riding your bike in the Alps and getting ready to follow the Tour. Have you made any progress solving the parallel search problem in Damy?

Harm, how is that new 64-bit bitboard engine working?

Rein I hear has a search working in his new engine. I am looking forward to seeing it.

I have recovered from my broken foot and have been doing a lot of bicycling. Also busy giving the kingsrow user interface a face lift.

-- Ed
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Hola

Post by Rein Halbersma »

Ed Gilbert wrote:This forum has been awfully quiet lately! Where did everyone go? Watching world cup I imagine :-)

Bert, did you get to try your SSD drive? I'm interested to hear about it.

Gerard, what are you up to? Probably riding your bike in the Alps and getting ready to follow the Tour. Have you made any progress solving the parallel search problem in Damy?

Harm, how is that new 64-bit bitboard engine working?

Rein I hear has a search working in his new engine. I am looking forward to seeing it.

I have recovered from my broken foot and have been doing a lot of bicycling. Also busy giving the kingsrow user interface a face lift.

-- Ed
Hi Ed,

My search function is slowly coming together. I've been browsing the Stockfish source code quite a bit, as well as the BB+ report on the Rybka/Ippolit similarities. The Stockfish code is hard to digest though, with lots of 200+ line functions that do almost but not quite the same thing. I'm working on abstracting it into small-scale function-template building blocks that can more easily be combined, à la my move generator. I've also been testing my hash table quite a bit; more on that later this week...

Rein
Ed Gilbert
Posts: 860
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Re: Endgame database benchmarks

Post by Ed Gilbert »

My search function is slowly coming together. I've been browsing the Stockfish source code quite a bit, as well as the BB+ report on the Rybka/Ippolit similarities. The Stockfish code is hard to digest though, with lots of 200+ line functions that do almost but not quite the same thing. I'm working on abstracting it into small-scale function-template building blocks that can more easily be combined, à la my move generator. I've also been testing my hash table quite a bit; more on that later this week...
I found the Stockfish code helpful when I was working on the parallel search, but otherwise I did not learn much from it. Of course it is PVS where my search is MTD, so there are those differences. But chess has so much more going on. They have a lot more information that they can use for move ordering, and that spills over into the ways they do extensions and truncations. Most of it seemed not exactly applicable to draughts.

-- Ed
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Endgame database benchmarks

Post by Rein Halbersma »

Ed Gilbert wrote:
My search function is slowly coming together. I've been browsing the Stockfish source code quite a bit, as well as the BB+ report on the Rybka/Ippolit similarities. The Stockfish code is hard to digest though, with lots of 200+ line functions that do almost but not quite the same thing. I'm working on abstracting it into small-scale function-template building blocks that can more easily be combined, à la my move generator. I've also been testing my hash table quite a bit; more on that later this week...
I found the Stockfish code helpful when I was working on the parallel search, but otherwise I did not learn much from it. Of course it is PVS where my search is MTD, so there are those differences. But chess has so much more going on. They have a lot more information that they can use for move ordering, and that spills over into the ways they do extensions and truncations. Most of it seemed not exactly applicable to draughts.

-- Ed
Interesting that you think that chess has more information to base truncations on.

Of course, draughts does not have pawn structure, king safety, discovered checks or pinned pieces, but for most other concepts there are close draughts analogues. E.g. the pawn structure in chess models a long-term eval term that rarely changes during searches of moderate depth. In draughts, this could be captured by the structure of black and white pieces on the middle 2 rows. You could mimic this with a dedicated "pawn hash"-style hash table. King safety has no analogue of course, but breakthrough extensions should work almost the same.

There are also draughts specific features that have no chess analogue. E.g. the amount of actually and potentially controlled terrain (see the long discussion on the French forum by Gerard) in relation to the advancement of the pieces is very important in draughts. Also accurately estimating the number of defenders for outposts on the 6th row requires special care. This is somewhat similar to the SEE capture algorithm in chess.

For the search algorithm, I can see a role in draughts for almost all steps that the chess guys use. I think that even null moves can be made to work for draughts (the double null move trick by Vincent Diepeveen seems to take care of zugzwangs automatically).

One trick that is used in Stockfish is very interesting. They use LMR at fail-low nodes, but if after a reduced search the best opponent move is refuted by a move *connected* to the reduced move, then the reduction is aborted and the late move is re-searched to full depth.

[diagram]
Here's an example: black to move. Let's say we are searching at d=5 and that the PV so far is 30-35 39-34 8-13 33-29 13-19. The next white responses to 30-35 are reduced by LMR so the line 30-35 42-37 35-40 is searched at d=3 with an eval score that sees an unopposed black breakthrough as a black win.

Now we start to search at d=7. The PV becomes 30-35 39-34 8-13 33-29 13-19 32-28 26-31. The next white response to 30-35 is now searched at d=5: 42-37 35-40 37-31! 26x28 33x2 with a white win. At this point the LMR will notice that 35-40 fails due to 37-31, which is connected to the previously reduced move 42-37. Therefore 42-37 will be re-searched to d=7 to resolve the threat 37-31. Apparently, in Stockfish this is better than simply continuing the search at d=5 for responses to 42-37 better than 35-40.

Connection means in this case [from - X] (opponent move) [X - dest]. Other connection patterns are e.g. [X - dest] (opponent move) [from - X] (here the first move clears the space for the second move). See diagram below
[diagram]
Here 30-35 31-27 enables the threat 37-31! The variations are the same here.
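My reading of this connection test in code (a sketch of mine, using draughts square numbers; the function name and move encoding are hypothetical):

```python
# A "connected move" test for the LMR re-search trick. A move is a
# (from_square, to_square) pair in draughts numbering. Two patterns:
#   [from-X] (reply) [X-dest]: the refuting piece is the one that just
#       moved on from the reduced move's destination square;
#   [X-dest] (reply) [from-X]: the reduced move vacated the square that
#       the refutation now occupies.
def connected(reduced, refutation):
    r_from, r_to = reduced
    f_from, f_to = refutation
    return r_to == f_from or r_from == f_to

# The two diagram examples: 42-37 enables 37-31 (same piece moves on),
# and 31-27 enables 37-31 (square 31 is vacated).
```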

Rein
Ed Gilbert
Posts: 860
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Re: Endgame database benchmarks

Post by Ed Gilbert »

Hi Rein,
Interesting that you think that chess has more information to base truncations on.

Of course, draughts does not have pawn structure, king safety, discovered checks or pinned pieces, but for most other concepts there are close draughts analogues. E.g. the pawn structure in chess models a long-term eval term that rarely changes during searches of moderate depth. In draughts, this could be captured by the structure of black and white pieces on the middle 2 rows. You could mimic this with a dedicated "pawn hash"-style hash table. King safety has no analogue of course, but breakthrough extensions should work almost the same.

There are also draughts specific features that have no chess analogue. E.g. the amount of actually and potentially controlled terrain (see the long discussion on the French forum by Gerard) in relation to the advancement of the pieces is very important in draughts. Also accurately estimating the number of defenders for outposts on the 6th row requires special care. This is somewhat similar to the SEE capture algorithm in chess.
Actually when I said Stockfish I meant Glaurung, which is the predecessor of Stockfish and the source code that I found helpful when I was implementing a parallel search. I have looked at Stockfish also, but I am more familiar with Glaurung.

From reading the discussions among the chess programmers, one of the biggest elo gains in the last 15 or so years came from LMR. For those that are not familiar, LMR is short for late move reductions. The idea is that in an alphabeta search you of course order the moves by searching the expected strongest ones first, and if you still do not have a beta cutoff after the first few moves then you can reduce the search depth for the remaining moves, thus saving search time and allowing a deeper search in the same time for the better moves that were ordered near the top of the list.

In chess you have some additional info to use for ordering moves that does not exist in draughts -- checks, discovered checks, passed pawns, and the exchange evaluator which considers the worth of the exchanged pieces. In common we have hashtable moves, killer moves, and in some cases the history table moves, although I see that Crafty no longer uses that, which is also interesting as this is one of the move ordering terms in all versions of kingsrow.

Some of the other terms that you mentioned for draughts are terms that I use in position evaluation but have not considered using for move ordering. Maybe I should! I think at one time I tried using the static evaluation in the 8x8 checkers kingsrow for move ordering. It was quite a while ago and I cannot remember the results of the experiment, but since I'm not using that now I have to assume it was not a positive result. The structure of the center pieces in the middle rows, breakthroughs, and the evaluation of outposts are certainly important eval terms, as are tempo, and perhaps controlled terrain, or at least reserved moves in a closed game, but I have not been using these for move ordering. My point was that I tried LMR for draughts and it did not work nearly as well as the scheme that I was already using for reductions.
It's always difficult to know for sure why something 'works' or doesn't, but my suspicion about LMR was that it works better in chess due to the better move ordering info in that game.
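The LMR scheme Ed describes can be sketched as a negamax over an abstract game tree (a toy of mine, not kingsrow's or any engine's actual code; the reduction of one ply and the threshold of three full-depth moves are illustrative, not tuned values):

```python
# LMR sketch over a toy game tree: a position is an int leaf score
# (from the side to move) or a list of child positions, ordered best-first.
FULL_MOVES = 3   # moves searched at full depth before reducing
REDUCTION = 1    # depth reduction for late moves
INF = 10**9

def lmr_search(pos, depth, alpha=-INF, beta=INF):
    if isinstance(pos, int) or depth == 0:
        return pos if isinstance(pos, int) else 0  # stand-in static eval
    best = -INF
    for i, child in enumerate(pos):
        if i < FULL_MOVES:
            score = -lmr_search(child, depth - 1, -beta, -alpha)
        else:
            # late move: search at reduced depth first; if it does not
            # fail low, re-search it at full depth
            reduced = max(depth - 1 - REDUCTION, 0)
            score = -lmr_search(child, reduced, -beta, -alpha)
            if score > alpha:
                score = -lmr_search(child, depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # beta cutoff: remaining moves are skipped entirely
    return best
```

With good ordering, the reduced searches confirm the fail-low cheaply; the re-search guards against the case where a late move turns out better than expected.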

I never tried a null move in draughts and just assumed that it could not work because of the nature of the game. Do you have a reference for the double null move trick by Vincent D.?

-- Ed
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Endgame database benchmarks

Post by Rein Halbersma »

Ed Gilbert wrote:Hi Rein,

Actually when I said Stockfish I meant Glaurung, which is the predecessor of Stockfish and the source code that I found helpful when I was implementing a parallel search. I have looked at Stockfish also, but I am more familiar with Glaurung.

From reading the discussions among the chess programmers, one of the biggest elo gains in the last 15 or so years came from LMR. For those that are not familiar, LMR is short for late move reductions. The idea is that in an alphabeta search you of course order the moves by searching the expected strongest ones first, and if you still do not have a beta cutoff after the first few moves then you can reduce the search depth for the remaining moves, thus saving search time and allowing a deeper search in the same time for the better moves that were ordered near the top of the list.

In chess you have some additional info to use for ordering moves that does not exist in draughts -- checks, discovered checks, passed pawns, and the exchange evaluator which considers the worth of the exchanged pieces. In common we have hashtable moves, killer moves, and in some cases the history table moves, although I see that Crafty no longer uses that, which is also interesting as this is one of the move ordering terms in all versions of kingsrow.

Some of the other terms that you mentioned for draughts are terms that I use in position evaluation but have not considered using for move ordering. Maybe I should! I think at one time I tried using the static evaluation in the 8x8 checkers kingsrow for move ordering. It was quite a while ago and I cannot remember the results of the experiment, but since I'm not using that now I have to assume it was not a positive result. The structure of the center pieces in the middle rows, breakthroughs, and the evaluation of outposts are certainly important eval terms, as are tempo, and perhaps controlled terrain, or at least reserved moves in a closed game, but I have not been using these for move ordering. My point was that I tried LMR for draughts and it did not work nearly as well as the scheme that I was already using for reductions.
It's always difficult to know for sure why something 'works' or doesn't, but my suspicion about LMR was that it works better in chess due to the better move ordering info in that game.

I never tried a null move in draughts and just assumed that it could not work because of the nature of the game. Do you have a reference for the double null move trick by Vincent D.?

-- Ed
http://www.talkchess.com/forum/viewtopi ... 03&t=29176
http://talkchess.com/forum/viewtopic.ph ... e+checkers

At every node, after searching the TT move but before any other move, you first try a null move (i.e. simply passing the right to move) at a reduced depth (R = 3 typically, sometimes R = 2 at low depths). If the reduced search does not fail high, then you keep searching all other moves. Null moves in chess cannot be made when in check. In draughts, you cannot make a null move when you have to capture.

Null moves are made recursively in the entire search. Some engines only try it if the static (or lazy) eval is near beta. Others try it everywhere. The only exception is that you can at most have 2 null moves in a row. This will take care of immediate zugzwangs. E.g. if all pieces are in direct opposition, a variation like [null null something] will immediately lose material and fail low. This means that the second null move fails high and the first null move fails low and an incorrect pruning is avoided. Note that the minimal search depth to detect an immediate zugzwang with a double null move is 2 R + 3. Distant zugzwangs are more difficult to detect of course, but the same holds for a regular search (albeit to a lesser degree).

Captures are a bit subtle. First, you cannot null move when you yourself have to capture. But what if your opponent threatens to capture? If you null move, he captures 1 piece and you recapture (say) 2 or more pieces, so it looks like you win material. But how do you know for sure that you have a passing move? So one should allow an exception: you cannot make a null move when you have to capture, unless it is a response to a previous null move (to let the opponent first prove that he can let you capture). An alternative would be to forbid making a null move when the opponent threatens a capture, but to first search a normal move at depth = d - 2R - 2 (so still making the double null move reduction, but really doing a null move plus a normal capture move) to see if you can safely let the opponent capture.

Of course, there's no guarantee that it will work. Perhaps zugzwangs are too prevalent or too deep to detect efficiently in draughts. I'll try it out though!
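The double null move recipe above could be sketched over an abstract game like this (a toy of mine, not any engine's code; the position encoding, the stand-in eval, and the leaf values in the comments are all made up for illustration):

```python
# Double null move sketch. A position is an int leaf score (from the side
# to move), or a dict: "moves" = children after real moves, "null" = the
# position after passing, or None when a capture is forced.
INF = 10**9
R = 3  # null-move depth reduction

def null_search(pos, depth, alpha=-INF, beta=INF, nulls=0):
    if isinstance(pos, int):
        return pos
    if depth <= 0:
        return 0  # stand-in static eval for internal nodes
    # at most 2 consecutive null moves; none at all when we must capture
    if nulls < 2 and pos["null"] is not None:
        score = -null_search(pos["null"], depth - 1 - R,
                             -beta, -beta + 1, nulls + 1)
        if score >= beta:
            return score  # passing already refutes the opponent: prune
    best = -INF
    for child in pos["moves"]:
        score = -null_search(child, depth - 1, -beta, -alpha, 0)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break
    return best
```

In a zugzwang position (every real move loses), the first null move's fail-high is itself refuted by the opponent's answering null move, so the incorrect pruning is avoided, exactly as described above.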

Rein
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Hola

Post by TAILLE »

Ed Gilbert wrote:This forum has been awfully quiet lately! Where did everyone go? Watching world cup I imagine :-)

Gerard, what are you up to? Probably riding your bike in the Alps and getting ready to follow the Tour. Have you made any progress solving the parallel search problem in Damy?

-- Ed
Hi Ed.
Yes Ed, I am still working hard on my parallel search algorithm and also on a parallel generation process, in order to be able to begin the 8 piece database generation. Now I am able to generate the 6 piece database in about 4 hours with two quad processors.
At the same time I am completely changing my endgame database format, which was not optimised to work with the MTD(f)-based algorithm I use in my engine. The idea is the following: when you use an MTD(f) algorithm and you reach a position that is in the endgame database, you do not need the exact value (loss, draw, win) of the position. If for example the test value is greater than 0, then you only need to know if the result is a win. If not, it does not matter if the result is a draw or a loss.
My new 6 piece endgame database is now made of two sub-databases, in order to answer the two following questions:
1) Is the concerned position a winning position?
2) Is the concerned position a losing position?
What are the consequences? I have an extension of the db on the disk (about 20%) but, when reading the disk, I avoid storing in memory a great amount of useless information. Because it does no harm to have this 20% extension on the disk, I suspect it is a good improvement.
Another advantage is the following: it must be possible to store in memory all the losing positions of the 7 piece endgame db and avoid any disk access during a real game.
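Gérard's two-sub-database idea could look like this in outline (class and method names are mine; the real databases are compressed bitsets on disk, and only the sub-database matching the side of the test value needs to be resident):

```python
# Two boolean sub-databases instead of one WIN/DRAW/LOSS table. An MTD(f)
# null-window probe only needs a bound, so each probe consults just one
# of the two; "DRAW" then really means "not a win" or "not a loss".
WIN, DRAW, LOSS = 1, 0, -1

class EndgameDB:
    def __init__(self, values):           # values: WIN/DRAW/LOSS per index
        self.is_win  = [v == WIN  for v in values]
        self.is_loss = [v == LOSS for v in values]

    def probe(self, index, test_value):
        if test_value > DRAW:             # testing "is it a win?"
            return WIN if self.is_win[index] else DRAW
        else:                             # testing "is it a loss?"
            return LOSS if self.is_loss[index] else DRAW
```

Storing a draw and a loss identically on the "win?" side is exactly what saves the memory: the information an MTD(f) search never asks for is never loaded.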
Gérard
MichelG
Posts: 244
Joined: Sun Dec 28, 2003 20:24
Contact:

Re: Hola

Post by MichelG »

TAILLE wrote: Hi Ed.
Yes Ed, I am still working hard on my parallel search algorithm and also on a parallel generation process, in order to be able to begin the 8 piece database generation. Now I am able to generate the 6 piece database in about 4 hours with two quad processors.
Is that 4 hours for the complete (3v3, 4v2, 5v1) databases? That's quite impressive!

I am also working on creating the 8 piece databases. The code runs in parallel, but it needs some work to run faster. Did you make algorithmic improvements, or did you just optimize your existing code?



Michel
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Hola

Post by TAILLE »

MichelG wrote:
TAILLE wrote: Hi Ed.
Yes Ed, I am still working hard on my parallel search algorithm and also on a parallel generation process, in order to be able to begin the 8 piece database generation. Now I am able to generate the 6 piece database in about 4 hours with two quad processors.
Is that 4 hours for the complete (3v3, 4v2, 5v1) databases? That's quite impressive!

I am also working on creating the 8 piece databases. The code runs in parallel, but it needs some work to run faster. Did you make algorithmic improvements, or did you just optimize your existing code?



Michel
Yes Michel, I need 4 hours for generating the complete (3v3, 4v2, 5v1) db. This includes the compression phase but not the verification process. Verification is another problem; I need far more time for verification than for building the db. For the 8 piece db I intend to limit the verification process a bit more.

What is your approach for this verification phase?
Gérard
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Hola

Post by Rein Halbersma »

TAILLE wrote:
MichelG wrote:
TAILLE wrote: Hi Ed.
Yes Ed, I am still working hard on my parallel search algorithm and also on a parallel generation process, in order to be able to begin the 8 piece database generation. Now I am able to generate the 6 piece database in about 4 hours with two quad processors.
Is that 4 hours for the complete (3v3, 4v2, 5v1) databases? That's quite impressive!

I am also working on creating the 8 piece databases. The code runs in parallel, but it needs some work to run faster. Did you make algorithmic improvements, or did you just optimize your existing code?



Michel
Yes Michel, I need 4 hours for generating the complete (3v3, 4v2, 5v1) db. This includes the compression phase but not the verification process. Verification is another problem; I need far more time for verification than for building the db. For the 8 piece db I intend to limit the verification process a bit more.

What is your approach for this verification phase?
Why don't you build the databases twice and check whether they give the same results? That and a comparison with the results already published by Ed and others should make sure your results are correct.
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Hola

Post by TAILLE »

Hi Rein,
Rein Halbersma wrote: Why don't you build the databases twice and check whether they give the same results? That and a comparison with the results already published by Ed and others should make sure your results are correct.
That's not so simple. When I rebuilt my 2, 3, 4, 5 and 6 piece dbs for the first time with my new multithreaded algorithm, I was not able to discover the inconsistencies in my db: after each new db was built, the verification process was OK and all the results seemed identical to my "old" db and to the results already published.
Fortunately I ran a complete verification of the db again, and I discovered a bug that disturbed some indexes of the 4 piece db while building the 6 piece db. The problem is now fixed, and from that experience I now intend to build the complete 7 piece db before running a complete verification of the 2-7 dbs.
For the 8 piece db I will take the same approach: I intend to generate the complete db first, without any verification other than comparing the number of losses, draws and wins with Ed, and then I will run a (probably partial) verification process.
Gérard
Ed Gilbert
Posts: 860
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Re: Endgame database benchmarks

Post by Ed Gilbert »

It is nice to hear that others are working on the 8pc database. I also heard from Gijsbert Wiesenekker recently that he is building the distance to win database for 7 pieces.

I would be interested to hear how others are doing the concurrency. I subdivided the 8pc db into hundreds of thousands of small pieces which at any instant in time typically allowed 100 or more subdivisions to be built in parallel. Schaeffer did not subdivide nearly as much but instead broke the build process up into smaller steps -- capture pass, conversion pass, non-conversion passes, compression, and verification, and used a work queue that could hand out one of these tasks to an available computer on the network. My build program is single threaded as once it gets the information about the next subdivision to build from the master of the cluster it can do all those steps unassisted, and then send the finished compressed result back to the master and move on to the next one. Are you running many instances of one build program as I did, or is it one program with many threads? How do you subdivide the work?
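Ed's master/worker scheme, reduced to a single-process toy (the real build hands subdivisions out to build programs across a cluster and ships compressed slices back over the network; `build` and the worker count here are stand-ins of mine):

```python
# Work-queue sketch: the master fills a queue with subdivision ids, each
# worker repeatedly takes one, builds it unassisted, and reports the
# finished result back before moving on to the next.
import queue
import threading

def build(subdivision_id):
    # stand-in for the capture/conversion/compression passes of one slice
    return subdivision_id * 2

def run_cluster(subdivisions, n_workers=4):
    work, results = queue.Queue(), {}
    lock = threading.Lock()
    for s in subdivisions:
        work.put(s)

    def worker():
        while True:
            try:
                s = work.get_nowait()
            except queue.Empty:
                return            # no subdivisions left: worker retires
            r = build(s)
            with lock:
                results[s] = r    # "send the finished result to the master"
    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Subdividing into many small independent pieces is what keeps every worker busy: at any instant, as many subdivisions are in flight as there are idle workers.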

-- Ed
MichelG
Posts: 244
Joined: Sun Dec 28, 2003 20:24
Contact:

Re: Endgame database benchmarks

Post by MichelG »

Dragon runs as a single, multithreaded application. I try to have it work on databases as big as possible that still fit in memory. Currently the databases are split according to the number of pieces, like 1210 for a database with 1 man, 2 kings and 1 black man.

For databases with 7 pieces, these are subdivided according to the rank of the most advanced man (splitting into at most 9*9 = 81 subslices). With 8 pieces I subdivide according to the position of the most advanced piece (45*45 slices).

At each iteration all cores work on the same database. I think that works quite well, because each of the passes through the database can be multithreaded quite efficiently :-) The only disadvantage is that you can't easily run it concurrently on several computers, but it is possible manually by generating 5 vs 3 in parallel with 4 vs 4 pieces.

I didn't parallelize the compression phase though as it still runs fast enough to be not much of a problem.

One problem that Dragon still has is that its algorithm is intended for computing distances to win instead of only win/draw/loss. I think that is slowing it down a bit, and I am not quite happy about the speed it runs at (a 5 piece database build takes 45 minutes on a Q6600, or about 30 minutes on an i860).
I also heard from Gijsbert Wiesenekker recently that he is building the distance to win database for 7 pieces.
I already did a (partial) calculation of these. A slight complication is that some of the positions take 256 ply to solve, which means you can't fit a position in a byte. Dragon uses 10 bits/position now to handle that.
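A 10-bits-per-position array could be packed like this (a sketch of mine; Dragon's actual layout is not described here, and the little-endian bit order is an arbitrary choice). Ten bits hold distances up to 1023 ply, versus 255 for a byte:

```python
# Pack/unpack fixed-width 10-bit values into a byte buffer.
BITS = 10
MASK = (1 << BITS) - 1

def pack(values):
    buf, acc, nbits = bytearray(), 0, 0
    for v in values:
        acc |= (v & MASK) << nbits   # append 10 bits to the accumulator
        nbits += BITS
        while nbits >= 8:            # flush whole bytes
            buf.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:                        # flush the final partial byte
        buf.append(acc & 0xFF)
    return bytes(buf)

def unpack(buf, count):
    out, acc, nbits, i = [], 0, 0, 0
    for _ in range(count):
        while nbits < BITS:          # refill until 10 bits are available
            acc |= buf[i] << nbits
            i += 1
            nbits += 8
        out.append(acc & MASK)
        acc >>= BITS
        nbits -= BITS
    return out
```

Five positions then take 7 bytes instead of 10, a 37.5% saving over 2 bytes per position.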

You say that verifying takes more time than the original computation. For me, I just check for consistency within the database, and that takes about half the time of the build. It isn't a true verification because it wouldn't find errors in, for example, the move generator, but it does find (virtually all) hardware errors. Other than that, running the program twice would be the only option to make absolutely sure there are no hardware errors.

Michel
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Endgame database benchmarks

Post by TAILLE »

MichelG wrote:Dragon runs as a single, multithreaded application. I try to have it work on databases as big as possible that still fit in memory. Currently the databases are split according to the number of pieces, like 1210 for a database with 1 man, 2 kings and 1 black man.

For databases with 7 pieces, these are subdivided according to the rank of the most advanced man (splitting into at most 9*9 = 81 subslices). With 8 pieces I subdivide according to the position of the most advanced piece (45*45 slices).

At each iteration all cores work on the same database. I think that works quite well, because each of the passes through the database can be multithreaded quite efficiently :-) The only disadvantage is that you can't easily run it concurrently on several computers, but it is possible manually by generating 5 vs 3 in parallel with 4 vs 4 pieces.

I didn't parallelize the compression phase though as it still runs fast enough to be not much of a problem.

One problem that Dragon still has is that its algorithm is intended for computing distances to win instead of only win/draw/loss. I think that is slowing it down a bit, and I am not quite happy about the speed it runs at (a 5 piece database build takes 45 minutes on a Q6600, or about 30 minutes on an i860).
I also heard from Gijsbert Wiesenekker recently that he is building the distance to win database for 7 pieces.
I already did a (partial) calculation of these. A slight complication is that some of the positions take 256 ply to solve, which means you can't fit a position in a byte. Dragon uses 10 bits/position now to handle that.

You say that verifying takes more time than the original computation. For me, I just check for consistency within the database, and that takes about half the time of the build. It isn't a true verification because it wouldn't find errors in, for example, the move generator, but it does find (virtually all) hardware errors. Other than that, running the program twice would be the only option to make absolutely sure there are no hardware errors.

Michel
Like Dragon, Damy runs as a single, multithreaded application, all cores working on the same database. The subdivisions for Damy are different. For a db without men, Damy creates 100 subdivisions based on the position indexes. For dbs with men, the subdivisions are based on the men configuration of one side.
Generation time with two quad-core i7s, Damy using 7 cores:
complete 5 piece db: 5'
complete 6 piece db: 4h
0025 db (bm, wm, bk, wk): 35' (without symmetry considerations)
0124 db: 2h 50'
Gérard