Clone Conflict

Discussion about development of draughts in the time of computer and Internet.
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Clone Conflict

Post by Rein Halbersma » Fri Aug 26, 2016 10:23

Joost Buijs wrote: The time I did this was in 2000 or 2001, there were only a few smp chess-programs, and there wasn't much knowledge about it besides some articles in the ICCA journal.
Nowadays it is a lot easier, compiler support is better, and there is a lot more knowledge about smp in general.
If I had to do it today I would also peek at other programs to get an idea, but that doesn't mean copying code.
My epxerience with reading other people's code is that it's often in a completely different style than what I would use (I'm sure people think the same about my code :)). So literal copying is out of the question for me. Not sure how you think about "translating" other code to match your style, data structures etc.

For me the real thrill of programming is finding new ways of doing things, but before that you have to really understand the old ways. That's why I am so interested in e.g. your magic code, not because I want to copy it, but to see how it is done, so I can try and tinker with it and who know mess it up in bad way :)

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Clone Conflict

Post by BertTuyt » Fri Aug 26, 2016 12:30

Rein, altough I excluded Magic in the recent MoveGenerator() implementation, I have the non-selfexplaining sources still available. Both how the Magic numbers are calculated, as the implementation in the MoveGenerator().
If you want/need I can post the sources.

Bert

Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Clone Conflict

Post by Rein Halbersma » Fri Aug 26, 2016 13:01

BertTuyt wrote:Rein, altough I excluded Magic in the recent MoveGenerator() implementation, I have the non-selfexplaining sources still available. Both how the Magic numbers are calculated, as the implementation in the MoveGenerator().
If you want/need I can post the sources.

Bert
yes please! and btw, why don't you try GitHub or BitBucket to post your code? Ed also use it.

Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Clone Conflict

Post by Joost Buijs » Sat Aug 27, 2016 09:38

Rein Halbersma wrote: For me the real thrill of programming is finding new ways of doing things, but before that you have to really understand the old ways. That's why I am so interested in e.g. your magic code, not because I want to copy it, but to see how it is done, so I can try and tinker with it and who know mess it up in bad way :)
Rein,

Since you are interested in the magic code I will put my perft() project on Dropbox.
Basically it contains the whole move-generator, it is by no means perfect and there is still room for improvement.
Since I really want to continue work on my engine I decided to leave the move-generator as it is for the moment.

You can download it here:

https://www.dropbox.com/s/fpo0qh5qo7muw ... t.rar?dl=0

Joost

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Clone Conflict

Post by BertTuyt » Sat Aug 27, 2016 11:27

Herewith a previous version of the MoveGenerator with Magic for the King Moves.

https://www.dropbox.com/sh/umqwdg2h0f9t ... UW9Qa?dl=0

Bert

Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Clone Conflict

Post by Rein Halbersma » Sun Aug 28, 2016 01:08

Joost Buijs wrote:
Rein Halbersma wrote: For me the real thrill of programming is finding new ways of doing things, but before that you have to really understand the old ways. That's why I am so interested in e.g. your magic code, not because I want to copy it, but to see how it is done, so I can try and tinker with it and who know mess it up in bad way :)
Rein,

Since you are interested in the magic code I will put my perft() project on Dropbox.
Basically it contains the whole move-generator, it is by no means perfect and there is still room for improvement.
Since I really want to continue work on my engine I decided to leave the move-generator as it is for the moment.

You can download it here:

https://www.dropbox.com/s/fpo0qh5qo7muw ... t.rar?dl=0

Joost
Thanks, also to Bert: much appreciated!

Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Clone Conflict

Post by Joost Buijs » Sun Aug 28, 2016 06:46

Rein,

The move-generator I use in my engine differs in one aspect from the one in the perft() program, I store a whole move in 44 bit instead of 192. This is somewhat slower but not by much, at most a few percent. Probably I will regain that speed in my search() because the move-list is 3 times as small with less burden upon the cache.

For the hash-table entries I use a 64 bit key, 4 bit flags, 16 bit value and 44 bits move, a complete entry is 128 bits.
I use a bucket system with 4 entries per bucket, this fits nicely into a 64 byte cache-line.
The drawback is that this relies upon features of modern CPU's (>= Haswell) so my program will never run on older hardware, something I don't have in mind anyway.

Joost

Edit:

I forgot to mention that the draft (or depth as I call it) is encoded in the lower 8 bits of the key field.
Since the the bucket number is determined by the lower key bits the key can be easily reconstructed later.
Last edited by Joost Buijs on Wed Aug 31, 2016 09:51, edited 2 times in total.

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Clone Conflict

Post by BertTuyt » Sun Aug 28, 2016 14:04

Joost, maybe I missed the explanation in a previous post, but how are you able to include the whole move in 44 bits (only)?

One of the things I (related to different move format) I still want to test, is to include the move into 64 bits, by combing the new position Bitboards into 1 Bitboard ( so bbMove = bbWhitePiece | bbBlackPiece ).
With (to my knowledge) 2 disadvantages if in a capture the from and to square are the same (but that is also the case in my current implementation, where I only store bitboards) you can not extract the info from the Bitboards, and you still need to check on promotions (in general).

Think we also disussed this some years ago, but I never tested it.
Disadvantage is that a move is not a simple pointer update operation, but you need to calculate the new position based upon the (64bit) MoveInfo and the previous position.

Regarding the Hash-table entry I only store the Move-Index.
Maybe a disadvantage is that I need to call the MoveGenerator() first to get the real move (and I do a check if the Move-Index does not exceed the number of moves, basically an extra collision detection).
In your case it would be beneficial if the Hash-table entry creates an immediate cutoff.
If not, you need to call the MoveGenerator() anyway.

Bert

Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Clone Conflict

Post by Rein Halbersma » Sun Aug 28, 2016 16:07

BertTuyt wrote:Joost, maybe I missed the explanation in a previous post, but how are you able to include the whole move in 44 bits (only)?

One of the things I (related to different move format) I still want to test, is to include the move into 64 bits, by combing the new position Bitboards into 1 Bitboard ( so bbMove = bbWhitePiece | bbBlackPiece ).
With (to my knowledge) 2 disadvantages if in a capture the from and to square are the same (but that is also the case in my current implementation, where I only store bitboards) you can not extract the info from the Bitboards, and you still need to check on promotions (in general).

Think we also disussed this some years ago, but I never tested it.
Disadvantage is that a move is not a simple pointer update operation, but you need to calculate the new position based upon the (64bit) MoveInfo and the previous position.

Regarding the Hash-table entry I only store the Move-Index.
Maybe a disadvantage is that I need to call the MoveGenerator() first to get the real move (and I do a check if the Move-Index does not exceed the number of moves, basically an extra collision detection).
In your case it would be beneficial if the Hash-table entry creates an immediate cutoff.
If not, you need to call the MoveGenerator() anyway.

Bert
IIRC, Fabien uses a clever trick by storing all changed bits of a move in a single bitboard. I think he has the squares running from bits 1 through 54 (not 0 through 53) and then he uses bits 0...5 for the from square, and bits 58...63 for the dest square. These from/dest bits can never overlap with the bits of captured squares (because bits 1...5 are edge squares). It does require some masking to extract them again. Of course, you can easily encapsulate this by wrapping the bitboard in a class and defining from() and dest() member functions that do the masking in the implementation.

In fact, you could use this trick for any board that fits within 64 bits since even the first square on the 2nd row and last square on the second but last row are edge squares that cannot be captured. You don't need to use the fact that you have extra ghost bits at the end.

64-bit moves are also more compact to store in the move list, and for 10x10 boards with duplicate capture detection you never need more than 128 moves, so 128 * 8 = 1024 bytes for an entire move list. Just 16 cache lines.

I think Fabien's format is really interesting. Another interesting format is from Gerard, who uses a "delta" board with 1 bits for every bit that changes in the position. For circular captures, from == dest, and these bits cancel each (just initialize them with bitwise-xor). That will still allow to do make_move() on a position, but it will prevent printing the move when from == dest.

Also, for Thai and Russian draughts, Gerard's format is not suitable. For Thai draughts, you can land on a square where you also captured a piece (because Thai draughts immediately removes pieces after you jumped them). For Russian draughts, you can promote during a circular capture.

Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Clone Conflict

Post by Joost Buijs » Sun Aug 28, 2016 19:23

BertTuyt wrote:Joost, maybe I missed the explanation in a previous post, but how are you able to include the whole move in 44 bits (only)?

One of the things I (related to different move format) I still want to test, is to include the move into 64 bits, by combing the new position Bitboards into 1 Bitboard ( so bbMove = bbWhitePiece | bbBlackPiece ).
With (to my knowledge) 2 disadvantages if in a capture the from and to square are the same (but that is also the case in my current implementation, where I only store bitboards) you can not extract the info from the Bitboards, and you still need to check on promotions (in general).

Think we also disussed this some years ago, but I never tested it.
Disadvantage is that a move is not a simple pointer update operation, but you need to calculate the new position based upon the (64bit) MoveInfo and the previous position.

Regarding the Hash-table entry I only store the Move-Index.
Maybe a disadvantage is that I need to call the MoveGenerator() first to get the real move (and I do a check if the Move-Index does not exceed the number of moves, basically an extra collision detection).
In your case it would be beneficial if the Hash-table entry creates an immediate cutoff.
If not, you need to call the MoveGenerator() anyway.

Bert
Bert,

This is very easy to do on a CPU that supports BMI2, since you can capture on the inner 32 squares only, you can pack the captured piece bitboard into the lower 32 bits of a 64 bit word using pext() and you can reverse this operation by using pdep(). When you have a board representation with ghost-squares at bit 10, 21, 32 and 43 you can do:

Code: Select all

// pack capture
packed_bb = _pext_u64(captured_bb, 0x0000F79EF3DE7BC0);
// unpack capture
captured_bb =_pdep_u64(packed_bb, 0x0000F79EF3DE7BC0);
These are both very fast instructions (2 clock cycles), unfortunately they are only supported by the latest CPU's.
When you want to emulate this in software it gets way too slow, an other option is to find a magic multiplier solution but that won't be as fast as the above.
I have no plans to support older CPU's with my program and within a few years every CPU will support these instructions.

It is not necessary to generate moves when you have a hash-move, a simple check for legality of the hash-move will do, without such a check there is a chance the engine might crash or gets messed up in case of a collision. In practice I never encountered problems without such a check, but it is better to be safe than sorry.
When the hash-move doesn't generate a beta cutoff you still have to generate all moves, in many cases this is not needed though.

Joost

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Clone Conflict

Post by BertTuyt » Sun Aug 28, 2016 20:25

Joost, so far I only used the BITSCAN and POPCNT instructions.
But you are right, the new processors have many interesting bit instructions, which in the past were inefficient with a software implementation.

Think (as an example) the often used find-least significant bit and clear, which is often used in the MoveGenerator() is also now also available as 1 instruction.
So I need to better study the new options :).

Bert

Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Clone Conflict

Post by Joost Buijs » Mon Aug 29, 2016 08:03

BertTuyt wrote:Joost, so far I only used the BITSCAN and POPCNT instructions.
But you are right, the new processors have many interesting bit instructions, which in the past were inefficient with a software implementation.

Think (as an example) the often used find-least significant bit and clear, which is often used in the MoveGenerator() is also now also available as 1 instruction.
So I need to better study the new options :).

Bert
Yes there are several new instructions, pext() and pdep() being the most important ones, they have many uses.
The one still missing is to reverse bit-order in a byte or word, strange that Intel didn't implement this, even a lowly ARM processor has this.
Instructions like tzcnt() and lzcnt() are a tiny bit faster than their bitscan() relatives, at least in my engine I notice this, of course this depends upon other code as well.
Another interesting one is blsmsk() which I use in index calculation for the EGDB I'm working on.

Post Reply