
Endgame database generator source code
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Endgame database generator source code
Martin Fierz has published the source code for the program he used to build an 8-piece endgame database for English checkers. You can find the link at his CheckerBoard web site here: http://www.fierz.ch/checkers.htm. If you are interested in learning how this works, reading his code can help.
-- Ed
Re: Endgame database generator source code
Hi Ed,
Very interesting indeed; my approach is very similar. By the way, I found the following routine in the code:
int bitcount(int32 n)
{
return (bitsinword[n&0x0000FFFF]+bitsinword[(n>>16)&0x0000FFFF]);
}
Do you think it is still that interesting for a 64-bit word? Do you address this 65536-entry table four times?
For your information, I use the following routine:
short bitcount(INT64 x)
{
    x -= (x >> 1) & 0x5555555555555555;                              // fold into 2-bit counts
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333);  // 4-bit counts
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0f;                         // byte counts
    x = (x * 0x0101010101010101) >> 56;                              // sum all bytes into the top byte
    return (short) ((UINT8)x);
}
I do not know which is best, but in any case I like all these kinds of ideas.
Gérard
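For comparison, current compilers can emit the hardware POPCNT instruction directly. A minimal sketch using the GCC/Clang builtin (MSVC has a similar __popcnt64 intrinsic); this is an illustration, not code from either program:

```c
#include <stdint.h>

/* Hardware population count via a compiler builtin -- a modern
   alternative to both the table lookup and the SWAR routine above.
   __builtin_popcountll is GCC/Clang-specific. */
static unsigned int bitcount64(uint64_t x)
{
    return (unsigned int)__builtin_popcountll(x);
}
```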
-
- Posts: 1722
- Joined: Wed Apr 14, 2004 16:04
- Contact:
Re: Endgame database generator source code
Hi Gérard,
TAILLE wrote:By the way I find in the code the following routine. Do you think it is still that interesting for a 64-bit word? Do you address this 65536-entry table four times?
Code: Select all
int bitcount(int32 n) { return (bitsinword[n&0x0000FFFF]+bitsinword[(n>>16)&0x0000FFFF]); }
For your information I use the following routine. I do not know which is best, but in any case I like all these kinds of ideas.
Code: Select all
short bitcount(INT64 x) { x -= (x >> 1) & 0x5555555555555555; x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333); x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0f; x = (x * 0x0101010101010101) >> 56; return (short) ((UINT8)x); }
Gérard
How did you get the idea for your population count? I like the chess programming wiki very much: http://chessprogramming.wikispaces.com/Population+Count
I use a very simple method, looping over the set bits (Brian Kernighan's way):
Code: Select all
// population count of the number of 1-bits
unsigned int pop_count(BitBoard b)
{
unsigned int count = 0;
while (b) {
b &= b - 1; // reset LS1B
++count;
}
return count;
}
For finding the index of a single bit I use De Bruijn multiplication ( http://chessprogramming.wikispaces.com/BitScan ), which can be done as follows:
Code: Select all
// De Bruijn sequence: all 2^6 = 64 unique 6-bit prefixes occur in its 64-bit binary representation
const BitBoard DE_BRUIJN_SEQUENCE = 0x07edd5e59a4e28c2;
// the entry of this table indexed by the [6-bit prefix of 2^i * De Bruijn sequence] is equal to [i]
const unsigned int BIT_TABLE[NUM_BITS] = {
63, 0, 58, 1, 59, 47, 53, 2,
60, 39, 48, 27, 54, 33, 42, 3,
61, 51, 37, 40, 49, 18, 28, 20,
55, 30, 34, 11, 43, 14, 22, 4,
62, 57, 46, 52, 38, 26, 32, 41,
50, 36, 17, 19, 29, 10, 13, 21,
56, 45, 25, 31, 35, 16, 9, 12,
44, 24, 15, 8, 23, 7, 6, 5
};
// index of the least significant 1-bit
unsigned int index_ls1b(BitBoard b)
{
b &= (0 - b); // select LS1B
b *= DE_BRUIJN_SEQUENCE;
b >>= 58;
return BIT_TABLE[b];
}
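In practice index_ls1b() is typically used to serialize a bitboard into square numbers. A self-contained sketch of that loop, using the GCC/Clang __builtin_ctzll builtin as a stand-in for index_ls1b so it runs without the table; the loop structure is identical either way:

```c
#include <stdint.h>

typedef uint64_t BitBoard;

/* Serialize a bitboard into square indices, lowest square first.
   __builtin_ctzll stands in for index_ls1b() above; squares[] must
   hold at least 64 entries. Returns the number of set bits written. */
static int serialize(BitBoard b, unsigned int squares[])
{
    int n = 0;
    while (b) {
        squares[n++] = (unsigned int)__builtin_ctzll(b); /* index of LS1B */
        b &= b - 1;                                      /* clear LS1B */
    }
    return n;
}
```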
Re: Endgame database generator source code
Hi Rein,
Rein Halbersma wrote:How did you get the idea for your population count?
Look at http://chessprogramming.wikispaces.com/Population+Count
Gérard
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
I use the table lookup method of population count. Years ago in my English checkers program I tried some of the other algorithms and found that the table lookup was the fastest. Of course for English checkers I only needed it for a 32-bit bitboard, and at that time I only had 32-bit processors, so it is possible that one of the other methods is now faster. My feeling is that if the table is already in cache then table lookup should be fastest because it requires the fewest branch-free instructions. The calls to population count are so numerous in kingsrow that the table is probably always in cache, but I could be wrong. I think Rein's iterative algorithm should be fastest when only a few bits are set, but otherwise it might be slower.
Gerard, do you inline your population count function or call it as a normal function?
-- Ed
TAILLE wrote:Very interesting indeed; my approach is very similar.
Is your indexing approach the same as Martin's? He subdivides by the rank of the leading checker, the same as done by Schaeffer. But unlike Schaeffer, Martin's indexing does not eliminate the indices of illegal positions where black and white men are on the same square. Eliminating these in the leading-rank indexing requires a somewhat large table with an entry for every configuration of checkers of one color. I like Martin's approach, which is to simply test for these illegal positions during the build, so that their values are don't-cares. During run-length compression, these illegal positions all get swept up into the run lengths of adjacent valid positions, so they do not increase the size of the compressed database. In this way they are treated the same as capture positions.
-- Ed
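The don't-care absorption described above can be sketched in a few lines: during run-length compression an illegal or capture position never starts a run of its own, it simply extends whichever run is open. The DONT_CARE marker and the flat value array are illustrative assumptions, not kingsrow's actual encoding:

```c
#include <stddef.h>

/* Run-length compress database values. DONT_CARE entries (illegal or
   capture positions, hypothetical marker) get no value of their own:
   they are absorbed into the open run, so they cost no space. Returns
   the number of runs; run lengths always sum to n when the input does
   not begin and end with don't-cares only. */
enum { DONT_CARE = -1 };

static size_t rle_compress(const int *values, size_t n,
                           int *run_value, size_t *run_len)
{
    size_t nruns = 0, pending = 0;   /* don't-cares seen before the first run */
    for (size_t i = 0; i < n; ++i) {
        int v = values[i];
        if (v == DONT_CARE || (nruns > 0 && v == run_value[nruns - 1])) {
            if (nruns > 0)
                run_len[nruns - 1]++;   /* absorbed into the open run */
            else
                pending++;              /* leading don't-cares */
        } else {
            run_value[nruns] = v;       /* a new run starts here */
            run_len[nruns] = 1 + pending;
            pending = 0;
            nruns++;
        }
    }
    return nruns;
}
```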
-
- Posts: 1722
- Joined: Wed Apr 14, 2004 16:04
- Contact:
Ed,
Ed Gilbert wrote:My feeling is that if the table is already in cache then table lookup should be fastest because it requires the fewest branch-free instructions. The calls to population count are so numerous in kingsrow that the table is probably always in cache, but I could be wrong. I think Rein's iterative algorithm should be fastest when only a few bits are set, but otherwise it might be slower.
I think your way is faster for your program, as you (similar to Bert and Feike, I think) generate successors as moves, rather than differences in position as I do. Your eval needs a lot of population counts over dense bitboards. For my material updates, at least, making/unmaking the material counts of a move is a population count over a sparse bitboard. Perhaps I should have two versions: one for such sparse pop counts and one for boards where I know many bits are set.
Did you also time the difference between various methods for lookup of the lsb?
Rein
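The sparse case mentioned here can be sketched as an incremental material update: on make/unmake, a running count is adjusted by a Kernighan pop count of the capture mask, which loops once per captured man. The Position and Move layouts below are illustrative assumptions, not the actual structures of any program in this thread:

```c
#include <stdint.h>

/* Hypothetical position and move records for the sketch. */
typedef struct {
    uint64_t white_men, black_men;
    int white_count, black_count;    /* maintained incrementally */
} Position;

typedef struct {
    uint64_t from_to;    /* from and to squares of the moving white man */
    uint64_t captured;   /* sparse mask of captured black men */
} Move;

/* Kernighan pop count: one iteration per set bit, cheap on sparse masks. */
static unsigned int popcount_sparse(uint64_t b)
{
    unsigned int n = 0;
    while (b) { b &= b - 1; ++n; }
    return n;
}

static void make_move_white(Position *p, const Move *m)
{
    p->white_men ^= m->from_to;                           /* move the man */
    p->black_men ^= m->captured;                          /* remove captures */
    p->black_count -= (int)popcount_sparse(m->captured);  /* update material */
}

static void unmake_move_white(Position *p, const Move *m)
{
    p->white_men ^= m->from_to;
    p->black_men ^= m->captured;
    p->black_count += (int)popcount_sparse(m->captured);
}
```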
Ed Gilbert wrote:Gerard, do you inline your population count function or call it as a normal function?
No, but it is possible that the compiler does the job. I have to verify this point. In any case I know I can improve my program in several places.
Ed Gilbert wrote:Is your indexing approach the same as Martin's? He subdivides by the rank of the leading checker, the same as done by Schaeffer.
The idea is similar but the realisation is different. Let's take the following example. I would like to build 4k against 1k2m. With the 2 men I can build 45x44/2 = 990 positions. I can arrange the number n of subslices exactly as I want, on condition that 990 mod n = 0. In this example I can choose between 1, 2, 3, 5, 6, 9, 10, 11 ..., 330, 495, 990 subslices.
Ed Gilbert wrote:But unlike Schaeffer, Martin's indexing does not eliminate the indices of illegal positions where black and white men are on the same square.
In the current version I do not eliminate these illegal positions. Furthermore, I do not even eliminate the positions where a white king is on the same square as a black king!
For your information I am thinking about a new approach in which I do not need to use the "indextoposition" routine but only the "positiontoindex" routine.
Gérard
-
- Posts: 1722
- Joined: Wed Apr 14, 2004 16:04
- Contact:
TAILLE wrote:Hi Rein,
Rein Halbersma wrote:Did you also time the difference between various methods for lookup of the lsb?
I never need to calculate the index of the lsb. I only need to generate the corresponding mask, i.e. mask = x & (-x).
Gérard
In my current program I also don't need it. I just have it as a general routine in case I would ever need it. One example would be a lookup table with, for each square, a bitboard of available squares. The chess folks sometimes use this method to generate sliding-piece moves.
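The mask trick quoted above can be shown in isolation: in two's complement, -x keeps the lowest set bit of x and the zeros below it while flipping everything above, so x & (-x) leaves exactly that bit. A small sketch (the helper names are illustrative):

```c
#include <stdint.h>

/* Isolate the least significant 1-bit as a mask: no table, no index. */
static uint64_t ls1b_mask(uint64_t x)
{
    return x & (0 - x);
}

/* Split a bitboard into single-bit masks, lowest first, by repeatedly
   isolating and then clearing the LS1B. Returns the number of masks. */
static int split_into_masks(uint64_t x, uint64_t masks[])
{
    int n = 0;
    while (x) {
        masks[n++] = x & (0 - x);   /* isolate LS1B */
        x &= x - 1;                 /* clear LS1B */
    }
    return n;
}
```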
Currently I am working on a method to compute the terrain controlled by each side's pieces. This will be one of the features in the evaluation function (it is currently very simple: only material, mobility, balance and tempo). It also uses several bitboard tricks (look for "flood fill" in the chess wiki).
Rein
Last edited by Rein Halbersma on Thu Oct 30, 2008 00:59, edited 2 times in total.
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Hi Gerard,
TAILLE wrote:The idea is similar but the realisation is different. Let's take the following example. I would like to build 4k against 1k2m. With the 2 men I can build 45x44/2 = 990 positions. I can arrange the number n of subslices exactly as I want, on condition that 990 mod n = 0. In this example I can choose between 1, 2, 3, 5, 6, 9, 10, 11 ..., 330, 495, 990 subslices.
I suspect that we are doing the same thing, but I'm not sure because I don't quite understand how you pick the number of subslices. Let me explain what I do, and then maybe you can tell me how yours is different. I think you know how my indexing works, with subdivisions by piece counts, rank of the leading checker, the configuration of checkers on the leading rank, and the rank of the next most advanced checkers. But these subdivisions are only for indexing, file I/O, and caching, not for building. For building, every unique configuration of black men and white men gets built separately. So, for your example of 4k vs 1k2m, I call the build function 990 times, and each time it goes through the complete build process -- black-to-move and white-to-move conversion passes and multiple non-conversion passes. My indexing function is arranged so that the indices move in the smallest steps by moving the kings around, and make the largest steps by moving the men around. Pseudocode for building a subdivision looks like this:
Code: Select all
for (end_index = subdb->dbsize -1; end_index >= 0; end_index -= subdb->kingrange) {
start_index = end_index + 1 - subdb->kingrange;
build_segment(subdb, start_index, end_index, ...)
}
One nice property of treating every man configuration as a separate build is that all man moves are conversion moves during the build, so I have special functions that build only man movelists for the conversion passes and only king movelists for the non-conversion passes, which saves some time. Each call to build_segment builds the smallest range of positions possible, which usually helps processor caching, except in the case of subdivisions which are all men. That is a degenerate case where there are millions of calls to build_segment, and each one only resolves one position :-( Those are not so efficient, but even they are not terribly slow, so overall the process works well.
-- Ed
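The index layout described above, with kings as the fastest-varying component, can be sketched as a simple mixed-radix scheme: all positions sharing one man configuration occupy one contiguous block of kingrange indices, which is the range each build_segment call covers. The struct and names here are an illustrative reduction, not kingsrow's real indexing:

```c
#include <stdint.h>

/* Hypothetical subdatabase descriptor: kingrange is the number of king
   configurations per man configuration. */
typedef struct {
    uint64_t kingrange;
} SubDb;

/* Kings form the least significant "digit" of the index, so varying the
   kings moves the index in the smallest steps. */
static uint64_t position_to_index(const SubDb *db,
                                  uint64_t man_index, uint64_t king_index)
{
    return man_index * db->kingrange + king_index;
}

static void index_to_position(const SubDb *db, uint64_t index,
                              uint64_t *man_index, uint64_t *king_index)
{
    *man_index = index / db->kingrange;
    *king_index = index % db->kingrange;
}
```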
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
TAILLE wrote:I never need to calculate the index of the lsb. I only need to generate the corresponding mask, i.e. mask = x & (-x).
Does this mean that you don't use history tables in Damy for move ordering? The usual method is to index the table like HistoryTable[from_square][to_square], so you need some way to get the square numbers from the masks.
-- Ed
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
TAILLE wrote:In the current version I do not eliminate these illegal positions. Furthermore, I do not even eliminate the positions where a white king is on the same square as a black king!
Eliminating the positions where men interfere with men is costly, but getting rid of kings interfering with other kings or men is relatively easy. In your case though, since you do not use run-length compression but only store the positions for which you have valid values, I guess this does not matter.
TAILLE wrote:For your information I am thinking about a new approach in which I do not need to use the "indextoposition" routine but only the "positiontoindex" routine.
Yes, I think you can use an 'iterator' to move sequentially through all the positions, and it should be a little faster than indextoposition(). It makes the code a little messier though, and it means that you cannot store indices when you do reverse non-conversion passes -- instead you have to store positions in your resolved queue. I exchanged emails with a Russian checkers programmer who used this technique. Still, I like having these two complementary functions, and when I profiled the whole build process the indextoposition function was not one of the bottlenecks.
-- Ed
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Re: Endgame database generator source code
Rein Halbersma wrote:For finding the index of a single bit I use De Bruijn multiplication ( http://chessprogramming.wikispaces.com/BitScan ), which can be done as follows
Rein, the x86 architecture has a built-in instruction to do this, and the Microsoft compiler exposes it through an intrinsic -- _BitScanForward(). I imagine this must be faster than any table lookup, though I don't know for sure.
-- Ed
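A portable wrapper over the intrinsics mentioned here might look as follows; _BitScanForward64 is the 64-bit MSVC form and __builtin_ctzll the GCC/Clang equivalent. A sketch under those assumptions, with the result undefined for a zero argument, just like the raw instruction:

```c
#include <stdint.h>

/* Index of the least significant 1-bit via the hardware bit scan.
   The De Bruijn lookup above remains a compiler-independent fallback.
   Precondition: b != 0. */
#if defined(_MSC_VER)
#include <intrin.h>
static unsigned int bit_scan_forward(uint64_t b)
{
    unsigned long idx;
    _BitScanForward64(&idx, b);           /* MSVC intrinsic, x64 */
    return (unsigned int)idx;
}
#else
static unsigned int bit_scan_forward(uint64_t b)
{
    return (unsigned int)__builtin_ctzll(b);   /* GCC/Clang builtin */
}
#endif
```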
Ed Gilbert wrote:I don't quite understand how you pick the number of subslices
A subslice is not so important for me. It corresponds only to the number of positions I can completely resolve in memory, without disk I/O.
For storage on the disk the basic unit corresponds to all positions of the kings for a given position of the men; I build other subslices in order to store positions in blocks of around 4 KB.
Gérard
Ed Gilbert wrote:Does this mean that you don't use history tables in Damy for move ordering? The usual method is to index the table like HistoryTable[from_square][to_square], so you need some way to get the square numbers from the masks.
Indeed, I do not use a history table as you define it. I only store in the hash table a move leading to a cutoff.
Gérard