
Endgame database generator source code
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Endgame database generator source code
Martin Fierz has published the source code for the program he used to build an 8-piece endgame database for English checkers. You can find the link at his CheckerBoard web site here: http://www.fierz.ch/checkers.htm. If you are interested in learning how this works, reading his code can help.
-- Ed
Re: Endgame database generator source code
Hi Ed,
Very interesting indeed; my approach is very similar. By the way, I found the following routine in the code:
int bitcount(int32 n)
{
return (bitsinword[n&0x0000FFFF]+bitsinword[(n>>16)&0x0000FFFF]);
}
Do you think it is still that interesting for a 64-bit word? Do you address this 65536-entry table four times?
For your information, I use the following routine:
short bitcount(INT64 x)
{
    x -= (x >> 1) & 0x5555555555555555;                              // fold into 2-bit counts
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333);  // 4-bit counts
    x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0f;                         // byte counts
    x = (x * 0x0101010101010101) >> 56;                              // sum all bytes into the top byte
    return (short) ((UINT8)x);
}
I do not know which is best, but in any case I like all these kinds of ideas.
Gérard
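For comparison, current compilers can emit the hardware POPCNT instruction directly. A minimal sketch using the GCC/Clang builtin (MSVC has a similar __popcnt64 intrinsic); this is an illustration, not code from either program:

```c
#include <stdint.h>

/* Hardware population count via a compiler builtin -- a modern
   alternative to both the table lookup and the SWAR routine above.
   __builtin_popcountll is GCC/Clang-specific. */
static unsigned int bitcount64(uint64_t x)
{
    return (unsigned int)__builtin_popcountll(x);
}
```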
-
- Posts: 1722
- Joined: Wed Apr 14, 2004 16:04
- Contact:
Re: Endgame database generator source code
Hi Gérard,
TAILLE wrote:By the way I find in the code the following routine. Do you think it is still that interesting for a 64-bit word? Do you address this 65536-entry table four times?
Code: Select all
int bitcount(int32 n) { return (bitsinword[n&0x0000FFFF]+bitsinword[(n>>16)&0x0000FFFF]); }
For your information I use the following routine. I do not know which is best, but in any case I like all these kinds of ideas.
Code: Select all
short bitcount(INT64 x) { x -= (x >> 1) & 0x5555555555555555; x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333); x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0f; x = (x * 0x0101010101010101) >> 56; return (short) ((UINT8)x); }
Gérard
How did you get the idea for your population count? I like the chess programming wiki very much: http://chessprogramming.wikispaces.com/Population+Count
I use a very simple method, looping over the set bits (Brian Kernighan's way):
Code: Select all
// population count of the number of 1-bits
unsigned int pop_count(BitBoard b)
{
unsigned int count = 0;
while (b) {
b &= b - 1; // reset LS1B
++count;
}
return count;
}
For finding the index of a single bit I use De Bruijn multiplication ( http://chessprogramming.wikispaces.com/BitScan ), which can be done as follows:
Code: Select all
// De Bruijn sequence: all 2^6 = 64 unique 6-bit prefixes occur in its 64-bit binary representation
const BitBoard DE_BRUIJN_SEQUENCE = 0x07edd5e59a4e28c2;
// the entry of this table indexed by the [6-bit prefix of 2^i * De Bruijn sequence] is equal to [i]
const unsigned int BIT_TABLE[NUM_BITS] = {
63, 0, 58, 1, 59, 47, 53, 2,
60, 39, 48, 27, 54, 33, 42, 3,
61, 51, 37, 40, 49, 18, 28, 20,
55, 30, 34, 11, 43, 14, 22, 4,
62, 57, 46, 52, 38, 26, 32, 41,
50, 36, 17, 19, 29, 10, 13, 21,
56, 45, 25, 31, 35, 16, 9, 12,
44, 24, 15, 8, 23, 7, 6, 5
};
// index of the least significant 1-bit
unsigned int index_ls1b(BitBoard b)
{
b &= (0 - b); // select LS1B
b *= DE_BRUIJN_SEQUENCE;
b >>= 58;
return BIT_TABLE[b];
}
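In practice index_ls1b() is typically used to serialize a bitboard into square numbers. A self-contained sketch of that loop, using the GCC/Clang __builtin_ctzll builtin as a stand-in for index_ls1b so it runs without the table; the loop structure is identical either way:

```c
#include <stdint.h>

typedef uint64_t BitBoard;

/* Serialize a bitboard into square indices, lowest square first.
   __builtin_ctzll stands in for index_ls1b() above; squares[] must
   hold at least 64 entries. Returns the number of set bits written. */
static int serialize(BitBoard b, unsigned int squares[])
{
    int n = 0;
    while (b) {
        squares[n++] = (unsigned int)__builtin_ctzll(b); /* index of LS1B */
        b &= b - 1;                                      /* clear LS1B */
    }
    return n;
}
```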
Re: Endgame database generator source code
Hi Rein,
Rein Halbersma wrote:How did you get the idea for your population count?
Look at http://chessprogramming.wikispaces.com/Population+Count
Gérard
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
I use the table lookup method of population count. Years ago in my English checkers program I tried some of the other algorithms and found that the table lookup was the fastest. Of course for English checkers I only needed it for a 32-bit bitboard, and at that time I only had 32-bit processors, so it is possible that one of the other methods is now faster. My feeling is that if the table is already in cache then table lookup should be fastest because it requires the fewest branch-free instructions. The calls to population count are so numerous in kingsrow that the table is probably always in cache, but I could be wrong. I think Rein's iterative algorithm should be fastest when only a few bits are set, but otherwise it might be slower.
Gerard, do you inline your population count function or call it as a normal function?
-- Ed
TAILLE wrote:Very interesting indeed; my approach is very similar.
Is your indexing approach the same as Martin's? He subdivides by the rank of the leading checker, the same as done by Schaeffer. But unlike Schaeffer, Martin's indexing does not eliminate the indices of illegal positions where black and white men are on the same square. Eliminating these in the leading-rank indexing requires a somewhat large table with an entry for every configuration of checkers of one color. I like Martin's approach, which is to simply test for these illegal positions during the build, so that their values are don't-cares. During run-length compression, these illegal positions all get swept up into the run lengths of adjacent valid positions, so they do not increase the size of the compressed database. In this way they are treated the same as capture positions.
-- Ed
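The don't-care absorption described above can be sketched in a few lines: during run-length compression an illegal or capture position never starts a run of its own, it simply extends whichever run is open. The DONT_CARE marker and the flat value array are illustrative assumptions, not kingsrow's actual encoding:

```c
#include <stddef.h>

/* Run-length compress database values. DONT_CARE entries (illegal or
   capture positions, hypothetical marker) get no value of their own:
   they are absorbed into the open run, so they cost no space. Returns
   the number of runs; run lengths always sum to n when the input does
   not begin and end with don't-cares only. */
enum { DONT_CARE = -1 };

static size_t rle_compress(const int *values, size_t n,
                           int *run_value, size_t *run_len)
{
    size_t nruns = 0, pending = 0;   /* don't-cares seen before the first run */
    for (size_t i = 0; i < n; ++i) {
        int v = values[i];
        if (v == DONT_CARE || (nruns > 0 && v == run_value[nruns - 1])) {
            if (nruns > 0)
                run_len[nruns - 1]++;   /* absorbed into the open run */
            else
                pending++;              /* leading don't-cares */
        } else {
            run_value[nruns] = v;       /* a new run starts here */
            run_len[nruns] = 1 + pending;
            pending = 0;
            nruns++;
        }
    }
    return nruns;
}
```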
-
- Posts: 1722
- Joined: Wed Apr 14, 2004 16:04
- Contact:
Ed,
Ed Gilbert wrote:My feeling is that if the table is already in cache then table lookup should be fastest because it requires the fewest branch-free instructions. The calls to population count are so numerous in kingsrow that the table is probably always in cache, but I could be wrong. I think Rein's iterative algorithm should be fastest when only a few bits are set, but otherwise it might be slower.
I think your way is faster for your program, as you (similar to Bert and Feike, I think) generate successors as moves, rather than differences in position as I do. Your eval needs a lot of population counts over dense bitboards. For my material updates, at least, making/unmaking the material counts of a move is a population count over a sparse bitboard. Perhaps I should have two versions: one for such sparse pop counts and one for boards where I know many bits are set.
Did you also time the difference between various methods for lookup of the lsb?
Rein
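The sparse case mentioned here can be sketched as an incremental material update: on make/unmake, a running count is adjusted by a Kernighan pop count of the capture mask, which loops once per captured man. The Position and Move layouts below are illustrative assumptions, not the actual structures of any program in this thread:

```c
#include <stdint.h>

/* Hypothetical position and move records for the sketch. */
typedef struct {
    uint64_t white_men, black_men;
    int white_count, black_count;    /* maintained incrementally */
} Position;

typedef struct {
    uint64_t from_to;    /* from and to squares of the moving white man */
    uint64_t captured;   /* sparse mask of captured black men */
} Move;

/* Kernighan pop count: one iteration per set bit, cheap on sparse masks. */
static unsigned int popcount_sparse(uint64_t b)
{
    unsigned int n = 0;
    while (b) { b &= b - 1; ++n; }
    return n;
}

static void make_move_white(Position *p, const Move *m)
{
    p->white_men ^= m->from_to;                           /* move the man */
    p->black_men ^= m->captured;                          /* remove captures */
    p->black_count -= (int)popcount_sparse(m->captured);  /* update material */
}

static void unmake_move_white(Position *p, const Move *m)
{
    p->white_men ^= m->from_to;
    p->black_men ^= m->captured;
    p->black_count += (int)popcount_sparse(m->captured);
}
```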
Ed Gilbert wrote:Gerard, do you inline your population count function or call it as a normal function?
No, but it is possible that the compiler does the job. I have to verify this point. In any case I know I can improve my program in several places.
Ed Gilbert wrote:Is your indexing approach the same as Martin's? He subdivides by the rank of the leading checker, the same as done by Schaeffer.
The idea is similar but the realisation is different. Let's take the following example. I would like to build 4k against 1k2m. With the 2 men I can build 45x44/2 = 990 positions. I can arrange the number n of subslices exactly as I want, on condition that 990 mod n = 0. In this example I can choose between 1, 2, 3, 5, 6, 9, 10, 11 ..., 330, 495, 990 subslices.
Ed Gilbert wrote:But unlike Schaeffer, Martin's indexing does not eliminate the indices of illegal positions where black and white men are on the same square.
In the current version I do not eliminate these illegal positions. Furthermore, I do not even eliminate the positions where a white king is on the same square as a black king!
For your information I am thinking about a new approach in which I do not need to use the "indextoposition" routine but only the "positiontoindex" routine.
Gérard
-
- Posts: 1722
- Joined: Wed Apr 14, 2004 16:04
- Contact:
TAILLE wrote:Hi Rein,
Rein Halbersma wrote:Did you also time the difference between various methods for lookup of the lsb?
I never need to calculate the index of the lsb. I only need to generate the corresponding mask, i.e. mask = x & (-x).
Gérard
In my current program I also don't need it. I just have it as a general routine in case I would ever need it. One example would be a lookup table with, for each square, a bitboard of available squares. The chess folks sometimes use this method to generate sliding-piece moves.
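The mask trick quoted above can be shown in isolation: in two's complement, -x keeps the lowest set bit of x and the zeros below it while flipping everything above, so x & (-x) leaves exactly that bit. A small sketch (the helper names are illustrative):

```c
#include <stdint.h>

/* Isolate the least significant 1-bit as a mask: no table, no index. */
static uint64_t ls1b_mask(uint64_t x)
{
    return x & (0 - x);
}

/* Split a bitboard into single-bit masks, lowest first, by repeatedly
   isolating and then clearing the LS1B. Returns the number of masks. */
static int split_into_masks(uint64_t x, uint64_t masks[])
{
    int n = 0;
    while (x) {
        masks[n++] = x & (0 - x);   /* isolate LS1B */
        x &= x - 1;                 /* clear LS1B */
    }
    return n;
}
```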
Currently I am working on a method to compute the terrain controlled by each side's pieces. This will be one of the features in the evaluation function (it is currently very simple: only material, mobility, balance and tempo). It also uses several bitboard tricks (look for "flood fill" in the chess wiki).
Rein
Last edited by Rein Halbersma on Thu Oct 30, 2008 00:59, edited 2 times in total.
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Hi Gerard,
TAILLE wrote:The idea is similar but the realisation is different. Let's take the following example. I would like to build 4k against 1k2m. With the 2 men I can build 45x44/2 = 990 positions. I can arrange the number n of subslices exactly as I want, on condition that 990 mod n = 0. In this example I can choose between 1, 2, 3, 5, 6, 9, 10, 11 ..., 330, 495, 990 subslices.
I suspect that we are doing the same thing, but I'm not sure because I don't quite understand how you pick the number of subslices. Let me explain what I do, and then maybe you can tell me how yours is different. I think you know how my indexing works, with subdivisions by piece counts, rank of the leading checker, the configuration of checkers on the leading rank, and the rank of the next most advanced checkers. But these subdivisions are only for indexing, file I/O, and caching, not for building. For building, every unique configuration of black men and white men gets built separately. So, for your example of 4k vs 1k2m, I call the build function 990 times, and each time it goes through the complete build process -- black-to-move and white-to-move conversion passes and multiple non-conversion passes. My indexing function is arranged so that the indices move in the smallest steps by moving the kings around, and make the largest steps by moving the men around. Pseudocode for building a subdivision looks like this:
Code: Select all
for (end_index = subdb->dbsize -1; end_index >= 0; end_index -= subdb->kingrange) {
start_index = end_index + 1 - subdb->kingrange;
build_segment(subdb, start_index, end_index, ...)
}
One nice property of treating every man configuration as a separate build is that all man moves are conversion moves during the build, so I have special functions that build only man movelists for the conversion passes and only king movelists for the non-conversion passes, which saves some time. Each call to build_segment builds the smallest range of positions possible, which usually helps processor caching, except in the case of subdivisions which are all men. That is a degenerate case where there are millions of calls to build_segment, and each one only resolves one position :-( Those are not so efficient, but even they are not terribly slow, so overall the process works well.
-- Ed
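The index layout described above, with kings as the fastest-varying component, can be sketched as a simple mixed-radix scheme: all positions sharing one man configuration occupy one contiguous block of kingrange indices, which is the range each build_segment call covers. The struct and names here are an illustrative reduction, not kingsrow's real indexing:

```c
#include <stdint.h>

/* Hypothetical subdatabase descriptor: kingrange is the number of king
   configurations per man configuration. */
typedef struct {
    uint64_t kingrange;
} SubDb;

/* Kings form the least significant "digit" of the index, so varying the
   kings moves the index in the smallest steps. */
static uint64_t position_to_index(const SubDb *db,
                                  uint64_t man_index, uint64_t king_index)
{
    return man_index * db->kingrange + king_index;
}

static void index_to_position(const SubDb *db, uint64_t index,
                              uint64_t *man_index, uint64_t *king_index)
{
    *man_index = index / db->kingrange;
    *king_index = index % db->kingrange;
}
```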
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
TAILLE wrote:I never need to calculate the index of the lsb. I only need to generate the corresponding mask, i.e. mask = x & (-x).
Does this mean that you don't use history tables in Damy for move ordering? The usual method is to index the table like HistoryTable[from_square][to_square], so you need some way to get the square numbers from the masks.
-- Ed
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
TAILLE wrote:In the current version I do not eliminate these illegal positions. Furthermore, I do not even eliminate the positions where a white king is on the same square as a black king!
Eliminating the positions where men interfere with men is costly, but getting rid of kings interfering with other kings or men is relatively easy. In your case though, since you do not use run-length compression but only store the positions for which you have valid values, I guess this does not matter.
TAILLE wrote:For your information I am thinking about a new approach in which I do not need to use the "indextoposition" routine but only the "positiontoindex" routine.
Yes, I think you can use an 'iterator' to move sequentially through all the positions, and it should be a little faster than indextoposition(). It makes the code a little messier though, and it means that you cannot store indices when you do reverse non-conversion passes -- instead you have to store positions in your resolved queue. I exchanged emails with a Russian checkers programmer who used this technique. Still, I like having these two complementary functions, and when I profiled the whole build process the indextoposition function was not one of the bottlenecks.
-- Ed
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Re: Endgame database generator source code
Rein Halbersma wrote:For finding the index of a single bit I use De Bruijn multiplication ( http://chessprogramming.wikispaces.com/BitScan ), which can be done as follows
Rein, the x86 architecture has a built-in instruction to do this, and the Microsoft compiler exposes it through an intrinsic -- _BitScanForward(). I imagine this must be faster than any table lookup, though I don't know for sure.
-- Ed
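A portable wrapper over the intrinsics mentioned here might look as follows; _BitScanForward64 is the 64-bit MSVC form and __builtin_ctzll the GCC/Clang equivalent. A sketch under those assumptions, with the result undefined for a zero argument, just like the raw instruction:

```c
#include <stdint.h>

/* Index of the least significant 1-bit via the hardware bit scan.
   The De Bruijn lookup above remains a compiler-independent fallback.
   Precondition: b != 0. */
#if defined(_MSC_VER)
#include <intrin.h>
static unsigned int bit_scan_forward(uint64_t b)
{
    unsigned long idx;
    _BitScanForward64(&idx, b);           /* MSVC intrinsic, x64 */
    return (unsigned int)idx;
}
#else
static unsigned int bit_scan_forward(uint64_t b)
{
    return (unsigned int)__builtin_ctzll(b);   /* GCC/Clang builtin */
}
#endif
```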
Ed Gilbert wrote:I don't quite understand how you pick the number of subslices
A subslice is not so important for me. It corresponds only to the number of positions I can completely resolve in memory, without disk I/O.
For storage on the disk the basic unit corresponds to all positions of the kings for a given position of the men; I build other subslices in order to store positions in blocks of around 4 KB.
Gérard
Ed Gilbert wrote:Does this mean that you don't use history tables in Damy for move ordering? The usual method is to index the table like HistoryTable[from_square][to_square], so you need some way to get the square numbers from the masks.
Indeed, I do not use a history table as you define it. I only store in the hash table a move leading to a cutoff.
Gérard