
BitBoards
Ed, your implementation of the hash-Function sounds very interesting!
What I do (as most programmers i guess) is to assign to every board position with an associated specific piece a 64bit hash-value, and then xor all.
Your approach to already use the 64 bitboards shuffle them (shifts) and also perform an xor operation (if i understood well) , sounds far more intelligent if you get a homogeneous distribution over the hash-space in the end.
Also in this way you don't need to know the from to values of the piece being moved, or the values of the captures pieces.
Without providing details, but could you rewrite your method so that it would enable incremental update?
I tried to find some background info in the computer chess world, but did not find any reference so far.
Is your idea completely new, or already used elsewhere?
Anyway, sounds really interesting to me !!!!
Bert
What I do (as most programmers i guess) is to assign to every board position with an associated specific piece a 64bit hash-value, and then xor all.
Your approach to already use the 64 bitboards shuffle them (shifts) and also perform an xor operation (if i understood well) , sounds far more intelligent if you get a homogeneous distribution over the hash-space in the end.
Also in this way you don't need to know the from to values of the piece being moved, or the values of the captures pieces.
Without providing details, but could you rewrite your method so that it would enable incremental update?
I tried to find some background info in the computer chess world, but did not find any reference so far.
Is your idea completely new, or already used elsewhere?
Anyway, sounds really interesting to me !!!!
Bert
Based upon all feedback so far, Im thinking if it is possible to fully use bitboards in the MoveGenerator and the Search, without the need to "get" the index of the specific bitnumber.
As Ed has pointed out , by using a different Hash-function, you could "more or less" directed use "modified" bitboards (with Zobrist Hashing you need to have the index).
In my Program I also need to use the Move in the HashTable, to find the best move. I have encoded the move in 16 bits to cover the Move-Type (White, Black, Man, King, Normal or Capture Move) and the From and To-Position.
Next to that (to double check if the position is the right one), I included in the Hash-Data (which in my case is 64Bits), the index in the movelist, so I can check if the move is the same.
Maybe this is double work, and to rely on the index in the move-list is sufficient (have to test the number of errors).
Anyway the program will not crash if I take the wrong move, i guess.
If I omit the Move in the Hash-Data, then there is another occurence solved where I need to derive the From and To from the bitboard.
So my question, what are the others doing in their program, do they include a move-structure in the Hash-Data or an index.
Bert
As Ed has pointed out , by using a different Hash-function, you could "more or less" directed use "modified" bitboards (with Zobrist Hashing you need to have the index).
In my Program I also need to use the Move in the HashTable, to find the best move. I have encoded the move in 16 bits to cover the Move-Type (White, Black, Man, King, Normal or Capture Move) and the From and To-Position.
Next to that (to double check if the position is the right one), I included in the Hash-Data (which in my case is 64Bits), the index in the movelist, so I can check if the move is the same.
Maybe this is double work, and to rely on the index in the move-list is sufficient (have to test the number of errors).
Anyway the program will not crash if I take the wrong move, i guess.
If I omit the Move in the Hash-Data, then there is another occurence solved where I need to derive the From and To from the bitboard.
So my question, what are the others doing in their program, do they include a move-structure in the Hash-Data or an index.
Bert
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Bert,
My hashing function is nothing new, but it is much different than Zobrist's. You cannot update it incrementally, so I rely on the fact that it is fast enough to fully calculate each time it is needed. I designed and tested it using the information at several web sites. These two in particular were helpful: http://burtleburtle.net/bob/hash/ and http://bretm.home.comcast.net/~bretm/hash/
I tested the function for randomness using a chi squared bucket test, and for avalanche using several different testing functions. The randomness test uses the hashing function to fill a hashtable using a large number of test positions and then looks at the counts in each bucket and compares them to the expected result for a truly uniform random distribution. The avalanche test checks that a change in each individual bit of the input keys (bitboards) results in a change in every output bit of the resulting hashcode with probability 1/2. I built a test harness that let me sweep many of the parameters of the hashing function while running it through a quick version these tests. The most promising ones were saved and then tested more thoroughly.
To benchmark the overhead the hash function has on the search, I temporarily removed half of the shift/xor cycles and saw a 1% increase in search speed. From this I concluded that if the hashing function took zero time it would only speed up the search by 2%. Removing half of the mixing steps means that it does not do as good a job of randomly distributing the hashcodes, but even with half of it missing the probability of 2 positions resulting in the same hashcode is very small, and the searches looked the same as with the full hashing function except for the 1% speed increase.
-- Ed
My hashing function is nothing new, but it is much different than Zobrist's. You cannot update it incrementally, so I rely on the fact that it is fast enough to fully calculate each time it is needed. I designed and tested it using the information at several web sites. These two in particular were helpful: http://burtleburtle.net/bob/hash/ and http://bretm.home.comcast.net/~bretm/hash/
I tested the function for randomness using a chi squared bucket test, and for avalanche using several different testing functions. The randomness test uses the hashing function to fill a hashtable using a large number of test positions and then looks at the counts in each bucket and compares them to the expected result for a truly uniform random distribution. The avalanche test checks that a change in each individual bit of the input keys (bitboards) results in a change in every output bit of the resulting hashcode with probability 1/2. I built a test harness that let me sweep many of the parameters of the hashing function while running it through a quick version these tests. The most promising ones were saved and then tested more thoroughly.
To benchmark the overhead the hash function has on the search, I temporarily removed half of the shift/xor cycles and saw a 1% increase in search speed. From this I concluded that if the hashing function took zero time it would only speed up the search by 2%. Removing half of the mixing steps means that it does not do as good a job of randomly distributing the hashcodes, but even with half of it missing the probability of 2 positions resulting in the same hashcode is very small, and the searches looked the same as with the full hashing function except for the 1% speed increase.
-- Ed
-
- Posts: 860
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Since the x86 architecture has a single instruction for the _BitScanForward intrinsic, getting the bit number from a bitmask shouldn't be a speed issue, should it? I'm assuming this instruction is not horribly slow, but I didn't look up how many clock cycles it takes.BertTuyt wrote:Based upon all feedback so far, Im thinking if it is possible to fully use bitboards in the MoveGenerator and the Search, without the need to "get" the index of the specific bitnumber.
I store only the move index. Hyatt has a paper on the crafty web site which gives some test results where he increased the probability that two different positions hash to the same value, and then he observed the affect on the search. The conclusion was that the number of collisions has to get really gross before you are likely to see any affects on the search score or best move.So my question, what are the others doing in their program, do they include a move-structure in the Hash-Data or an index.
-- Ed
I am just rewriting this part of my program. My idea is simply to put in the hashtable one bitboard (64bits) to represent the expected best move, where each 1 represents a square that changes in the position. If this move is sufficient to provoque a cutoff then you can see that I do not need to call the move generator for that node.BertTuyt wrote: So my question, what are the others doing in their program, do they include a move-structure in the Hash-Data or an index.
Bert
In addition, when a move is unique (this is typically the case of a lot of captures) I keep this information in the hastable and here again I do not need to call the move generator. The probability to do an impossible move due to hash collision exists but seems to me acceptable.
Gérard
Gerard,
i like your idea (and also thought about this).
Especially the need to test a node without a MoveGenerator call
However in the case of Damage I use a 64bit HashKey and 64Bit HashData.
In this 64bit i packed all the required info.
As I use a central/shared hashtable for all threads (4 in case of my Quad), I used a Hyatt implementation where I xor the data with the HashKey and store that result in the HashData.
This way (see Hyatt article) you avoid a expensive lock-mechanism.
I did not study his paper in detail if it also works for 2 64-bits words as Data (at least I guess you use 2 or more 64-bit words).
Do you also have a shared hash-table without locking, or did you solve it in another way?
Bert
i like your idea (and also thought about this).
Especially the need to test a node without a MoveGenerator call
However in the case of Damage I use a 64bit HashKey and 64Bit HashData.
In this 64bit i packed all the required info.
As I use a central/shared hashtable for all threads (4 in case of my Quad), I used a Hyatt implementation where I xor the data with the HashKey and store that result in the HashData.
This way (see Hyatt article) you avoid a expensive lock-mechanism.
I did not study his paper in detail if it also works for 2 64-bits words as Data (at least I guess you use 2 or more 64-bit words).
Do you also have a shared hash-table without locking, or did you solve it in another way?
Bert
Hi Bert,BertTuyt wrote:Gerard,
i like your idea (and also thought about this).
Especially the need to test a node without a MoveGenerator call
However in the case of Damage I use a 64bit HashKey and 64Bit HashData.
In this 64bit i packed all the required info.
As I use a central/shared hashtable for all threads (4 in case of my Quad), I used a Hyatt implementation where I xor the data with the HashKey and store that result in the HashData.
This way (see Hyatt article) you avoid a expensive lock-mechanism.
I did not study his paper in detail if it also works for 2 64-bits words as Data (at least I guess you use 2 or more 64-bit words).
Do you also have a shared hash-table without locking, or did you solve it in another way?
Bert
I also have a shared hash-table without looking. My approach seem very different from yours because my hash structure is far bigger. I use 5x64bits per entry. One for the value of the hashkey, one for a checksum of this hash structure (in order to avoid a lock mechanism with the same idea you can find in Hyatt article), one for the expected best move (as I said before) and 2 others 64 bits to store various informations (evaluations, analysis depth and some flags like capture/no capture etc.).
I hope these information will help you.
Gérard
I have now totally changed my MoveGenerator based on suggestions from several people here on this forum.
The major change is that i nowhere now need the index of the specific bit in all routines.
Also i changed the output of the move format which now only contains a MoveCode, FROM-TO Bitboard (in most cases only 2 bits set), and a Kill/Capture Bitboard.
I still look into the details for further optimization regarding the number of Bitboards I used to describe a position.
I now still use 5 Bitboards, but I know 3 should be sufficient ( White Pieces, Black Pieces, and Kings), all other Bitboards can be derived from this.
But as stated before I don't know if this approach is faster then always update all Bitboards.
In the Perft blog you see now the timing for the 3 test positions used so far. Hope some-one will beat my time ( with a Q6000, as with a I7 this is straightforward), and that we all will learn from this (Note: I thought about a parallel Perft but did not go into this solution).
Bert
The major change is that i nowhere now need the index of the specific bit in all routines.
Also i changed the output of the move format which now only contains a MoveCode, FROM-TO Bitboard (in most cases only 2 bits set), and a Kill/Capture Bitboard.
I still look into the details for further optimization regarding the number of Bitboards I used to describe a position.
I now still use 5 Bitboards, but I know 3 should be sufficient ( White Pieces, Black Pieces, and Kings), all other Bitboards can be derived from this.
But as stated before I don't know if this approach is faster then always update all Bitboards.
In the Perft blog you see now the timing for the 3 test positions used so far. Hope some-one will beat my time ( with a Q6000, as with a I7 this is straightforward), and that we all will learn from this (Note: I thought about a parallel Perft but did not go into this solution).
Bert
Hi Bert, and the others,
do you need two generators, one very optimised for the tree exploration and another for man machine needs ?
The points are the following :

In the above diagram the "optimised" generator do not keep as a result that the capture is made by white 29

In this diagram, you can verify that white as only one legal move but the "optimised" generator do not tell us the order of captures which is necessary for a human body.
What is your feeling ?
Gérard
It seems now that our move generators are rather close from each other and very optimised but now I have a new question :BertTuyt wrote:I have now totally changed my MoveGenerator based on suggestions from several people here on this forum.
Bert
do you need two generators, one very optimised for the tree exploration and another for man machine needs ?
The points are the following :

In the above diagram the "optimised" generator do not keep as a result that the capture is made by white 29

In this diagram, you can verify that white as only one legal move but the "optimised" generator do not tell us the order of captures which is necessary for a human body.
What is your feeling ?
Gérard
-
- Posts: 1722
- Joined: Wed Apr 14, 2004 16:04
- Contact:
I agree with you, Gérard. Also for the DamExchange protocol you need to know From and To separately. I think Dam 2.2 animates the complete capture sequence in the right order. Of course, in really complicated positions, such as Ed's famous Random position, you have to pick one from 530 possible solutions!TAILLE wrote:Hi Bert, and the others,
It seems now that our move generators are rather close from each other and very optimised but now I have a new question :BertTuyt wrote:I have now totally changed my MoveGenerator based on suggestions from several people here on this forum.
Bert
do you need two generators, one very optimised for the tree exploration and another for man machine needs ?
What is your feeling ?
Gérard
Rein
- FeikeBoomstra
- Posts: 306
- Joined: Mon Dec 19, 2005 16:48
- Location: Emmen
Congratulations Bert,
I also experimented with a bitboard for start an stop. It works fine for man moves and man captures, but for king moves/captures I don't see a fast implementation.
Do you use a special trick I missed. My cap_king routine was something like:
I also experimented with a bitboard for start an stop. It works fine for man moves and man captures, but for king moves/captures I don't see a fast implementation.
Do you use a special trick I missed. My cap_king routine was something like:
Code: Select all
void cap_king(Position pos, BitBoard org_pos, BitBoard captured) {
BitBoard cur_pos = org_pos;
BitBoard last_captured;
while (cur_pos & pos->empty)
cus_pos = nw(cur_pos);
if ((cur_pos & pos->other_piece) && !(cur_pos & captured)) {
last_captured = cur_pos;
cur_pos = nw(cur_pos)
if (cur_pos & pos->empty) {
while (true) {
captured = captured | last_captured;
cap_king(pos, cur_position, captured)
captured = captured ^ last_captured;
cus_pos = nw(cur_pos);
if (!(cur_pos & pos->empty)) break;
}
}
}
// same for all directions
if (captured !=0) store_capture(....)
return;
}
Hi Feike,
First point : why not writing
Second point : it seems that the cap_king(pos, cur_position, captured) test the four directions which is not necessary when you already know the direction of the last capture.
Gérard
I think you can improve you routine in various places.FeikeBoomstra wrote:Code: Select all
while (true) { captured = captured | last_captured; cap_king(pos, cur_position, captured) captured = captured ^ last_captured; cus_pos = nw(cur_pos); if (!(cur_pos & pos->empty)) break; }
First point : why not writing
Code: Select all
captured = captured | last_captured;
while (true) {
cap_king(pos, cur_position, captured)
cus_pos = nw(cur_pos);
if (!(cur_pos & pos->empty)) break;
}
captured = captured ^ last_captured;
Gérard
- FeikeBoomstra
- Posts: 306
- Joined: Mon Dec 19, 2005 16:48
- Location: Emmen