Perft
-
- Posts: 1368
- Joined: Thu Jun 20, 2013 17:16
- Real name: Krzysztof Grzelak
Re: Perft
Joost please write that generate base ends 4x4 or 3x3.
Re: Perft
Joost, I have now also written some routines to calculate the Magic Numbers.
I encountered the same problem as you, for the squares 23 and 28 , so I also might need to increase the table with a factor of 2.
I used the random approach, more or less based on the info in https://chessprogramming.wikispaces.com ... for+Magics
Wit 1M Random trials for every square (and the resulting failures for square 23 and 28) it took 11.4 seconds.
See below.
I didn't test and/implemented the magic numbers into my MoveGenerator, so don't know the time impact, and there could be a bug somewhere still hidden.
The real challenge is to find a very quick magic-ish method for detecting a King-capture, so work to do.....
Bert
I encountered the same problem as you, for the squares 23 and 28 , so I also might need to increase the table with a factor of 2.
I used the random approach, more or less based on the info in https://chessprogramming.wikispaces.com ... for+Magics
Wit 1M Random trials for every square (and the resulting failures for square 23 and 28) it took 11.4 seconds.
See below.
I didn't test and/implemented the magic numbers into my MoveGenerator, so don't know the time impact, and there could be a bug somewhere still hidden.
The real challenge is to find a very quick magic-ish method for detecting a King-capture, so work to do.....
Bert
Code: Select all
T = 0.0, S = 1, C = 128, M = c2030600900100
T = 0.0, S = 2, C = 128, M = 1900440848020001
T = 0.0, S = 3, C = 128, M = 740402000024203
T = 0.0, S = 4, C = 128, M = 2008100810000410
T = 0.0, S = 5, C = 256, M = 8080502080200
T = 0.0, S = 6, C = 128, M = 2190010901202000
T = 0.0, S = 7, C = 128, M = 1042020408101300
T = 0.0, S = 8, C = 128, M = 4126022100000000
T = 0.0, S = 9, C = 128, M = 1841010024084c00
T = 0.0, S = 10, C = 128, M = 404045020002
T = 0.0, S = 11, C = 128, M = 481201010080844
T = 0.1, S = 12, C = 512, M = 10080200100400
T = 0.1, S = 13, C = 512, M = 30100201080020
T = 0.1, S = 14, C = 512, M = 4410008038801802
T = 0.1, S = 15, C = 128, M = 200c010100810002
T = 0.1, S = 16, C = 128, M = 454110102020020
T = 0.1, S = 17, C = 512, M = 801028010020200
T = 0.2, S = 18, C = 2048, M = 2000800840202
T = 0.3, S = 19, C = 2048, M = 9004002004002a
T = 0.3, S = 20, C = 128, M = 8400104040003
T = 0.3, S = 21, C = 128, M = 14400602404020
T = 0.6, S = 22, C = 2048, M = 480080020040100
T = 5.7, S = 23, C = 8192, M = 0
T = 5.8, S = 24, C = 512, M = 20c0020801010
T = 5.8, S = 25, C = 128, M = 60408308010008
T = 5.8, S = 26, C = 128, M = 1080408014201200
T = 5.9, S = 27, C = 512, M = 481206010020040
T = 11.0, S = 28, C = 8192, M = 0
T = 11.0, S = 29, C = 2048, M = 109002000800800
T = 11.0, S = 30, C = 128, M = 2100441804a0100c
T = 11.0, S = 31, C = 128, M = 10200400110000
T = 11.1, S = 32, C = 2048, M = 10600400200801
T = 11.3, S = 33, C = 2048, M = 1010008100082000
T = 11.3, S = 34, C = 512, M = 4006100a40020000
T = 11.3, S = 35, C = 128, M = 10c0500420900800
T = 11.3, S = 36, C = 128, M = 104100205a000a
T = 11.3, S = 37, C = 512, M = 420100804810001
T = 11.3, S = 38, C = 512, M = 21220100202004
T = 11.3, S = 39, C = 512, M = 2820200100402000
T = 11.3, S = 40, C = 128, M = 901020600444080
T = 11.3, S = 41, C = 128, M = 4004080404401180
T = 11.3, S = 42, C = 128, M = 4420402030102000
T = 11.3, S = 43, C = 128, M = 180010102020104
T = 11.3, S = 44, C = 128, M = 113404200840801
T = 11.3, S = 45, C = 128, M = 102008010842228
T = 11.4, S = 46, C = 256, M = 4408008040008
T = 11.4, S = 47, C = 128, M = 11102082b040000
T = 11.4, S = 48, C = 128, M = 10340206190
T = 11.4, S = 49, C = 128, M = 5000408220101002
T = 11.4, S = 50, C = 128, M = 804040a02010010
-
- Posts: 471
- Joined: Wed May 04, 2016 11:45
- Real name: Joost Buijs
Re: Perft
Bert,
Nice to hear that you are also busy with magic multiplication for move-generation.
When you are generating king-moves it helps, for men-moves it is unnecessary, I generate these in the normal way.
Generating magic numbers works best when the number of 1-bits in the multiplier are not too high, I made a 64 bit random generator with an adjustable number of 1-bits. Something like 8 to 10 1-bits seems to work fine for this application (draughts).
I tried hard to find a magic multiplier for the long diagonals, like you mentioned I could not find any, in that case I added an extra index bit which is no problem at all.
Last week I was fighting with my table-base generator, it seems to run very fast but there are still some small bugs I have to resolve, it is all about assumptions. The next three weeks I'm on holidays, at the beginning of September I will continue with the program.
Joost
Nice to hear that you are also busy with magic multiplication for move-generation.
When you are generating king-moves it helps, for men-moves it is unnecessary, I generate these in the normal way.
Generating magic numbers works best when the number of 1-bits in the multiplier are not too high, I made a 64 bit random generator with an adjustable number of 1-bits. Something like 8 to 10 1-bits seems to work fine for this application (draughts).
I tried hard to find a magic multiplier for the long diagonals, like you mentioned I could not find any, in that case I added an extra index bit which is no problem at all.
Last week I was fighting with my table-base generator, it seems to run very fast but there are still some small bugs I have to resolve, it is all about assumptions. The next three weeks I'm on holidays, at the beginning of September I will continue with the program.
Joost
Re: Perft
Joost, you are right for normal man moves and captures, the traditional bitboard and shift operations work best.
Magic is only relevant for King Captures (although I don't have an idea yet how to implement this in an easy way), and King moves (sliders).
My Holiday ends this weekend, so I hope to finalize the Magic implementation, and post perft results.
Enjoy your 3 weeks holiday.
Bert
Magic is only relevant for King Captures (although I don't have an idea yet how to implement this in an easy way), and King moves (sliders).
My Holiday ends this weekend, so I hope to finalize the Magic implementation, and post perft results.
Enjoy your 3 weeks holiday.
Bert
-
- Posts: 1722
- Joined: Wed Apr 14, 2004 16:04
- Contact:
Re: Perft
King capture generation is not that relevant, because after the first capture, you are not doing bit-parallel operations anymore. It's only for the detection of all possible king capture targets where this helps.BertTuyt wrote:Joost, you are right for normal man moves and captures, the traditional bitboard and shift operations work best.
Magic is only relevant for King Captures (although I don't have an idea yet how to implement this in an easy way), and King moves (sliders).
My Holiday ends this weekend, so I hope to finalize the Magic implementation, and post perft results.
Enjoy your 3 weeks holiday.
Bert
-
- Posts: 471
- Joined: Wed May 04, 2016 11:45
- Real name: Joost Buijs
Re: Perft
For each occupancy I have 4 masks for the 4 different directions (-6,-5,+5,+6), unfortunately I need one extra mask to prevent a capture from going past the edge of the board.BertTuyt wrote:Joost, you are right for normal man moves and captures, the traditional bitboard and shift operations work best.
Magic is only relevant for King Captures (although I don't have an idea yet how to implement this in an easy way), and King moves (sliders).
Bert
Maybe there is a way to omit that extra mask, but I don't think this is relevant because my move-generator is fast enough already.
As usual the biggest time consumers are the evaluation function and probing the hash-table.
Although I'm on holidays I will sometimes take a look here.
Joost
Edit:
Of course the above is meant for king-captures, for king-sliders I use only 1 mask, all directions at once.
For king-sliders magics are usable too because you can determine in a few operations all the squares you are allowed to move to, I don't see how you can do this with normal masks without shifting.
Storing the moves and checking for redundant moves probably takes more time than generating them, this limits the maximum speed you are able to reach.
-
- Posts: 471
- Joined: Wed May 04, 2016 11:45
- Real name: Joost Buijs
Re: Perft
Actually I started with the same code from the chess-programming wiki and modified it to some extent.BertTuyt wrote:Joost, I have now also written some routines to calculate the Magic Numbers.
I encountered the same problem as you, for the squares 23 and 28 , so I also might need to increase the table with a factor of 2.
I used the random approach, more or less based on the info in https://chessprogramming.wikispaces.com ... for+Magics
Wit 1M Random trials for every square (and the resulting failures for square 23 and 28) it took 11.4 seconds.
See below.
Since my move-generator seems to works flawlessly I assume that my magic multipliers are ok.
The code for my magics generator is on dropbox, if you are interested you can download it here:
https://www.dropbox.com/s/f129p9a81xcpl ... s.rar?dl=0
It is a small VS2015 project and it uses BMI2, since you have the same processor it should work without modification.
When you are interested I can also share the code that I use to generate the table with masks for each square and occupancy.
Joost
I've added some code that shows how to initialize the masks and how to use them, maybe you will find it useful.
You can download it here:
https://www.dropbox.com/s/upx4jhls0bwlr ... s.rar?dl=0
Re: Perft
Joost, thanks for sharing.
In the meantime im back in Switserland (where I live and work), and holiday has come to an end.
As I did not bother to make a connection to my faster 4 Ghz 8-core machine in Holland, I have to rely on a notebook here with a 2.7 Ghz processor (HP ProBook 6570b).
Yesterday ( = Saturday) I tried to debug the Magic code which I had started the weeks before.
It also now works (as the Perft numbers are the same), but for an unknown reason it is not faster.
I only use Magic for the King Moves, so not the King Capture detection (or alike).
Next to that I optmized a little the current non-Magic based MoveGenerator.
On my Notebook I got the next results for the 3 welknown positions.
1. Perft(11), 15.19s, 15.27s, 15.20s
2. Perft(9), 8.51s, 8.46s, 8.45s
3. Perft(15), 4.57s, 4.53s, 4.57s
Think especially the Perft results for 2. and 3. are fast, and Im curious what they would do on my own (or a similar machine).
I will do some more tests, to see if I can optimize further.
Next to that I will make a small Perft tester, so completely independant on my program (as the Perft now is embedded in my program, which could effect timing), to avoid any disturbances.
I will put the executable and source code on my Dropbox (think next weekend), so others (and you), can test.
Im also interested what the normal Visual Studio compiling will do (and optimization), and what the (positive) effect is of the Intel compiler.
Last but not least , hope/assume that Rein has some suggestions for my code, as he is the Master of programming here....
Bert
In the meantime im back in Switserland (where I live and work), and holiday has come to an end.
As I did not bother to make a connection to my faster 4 Ghz 8-core machine in Holland, I have to rely on a notebook here with a 2.7 Ghz processor (HP ProBook 6570b).
Yesterday ( = Saturday) I tried to debug the Magic code which I had started the weeks before.
It also now works (as the Perft numbers are the same), but for an unknown reason it is not faster.
I only use Magic for the King Moves, so not the King Capture detection (or alike).
Next to that I optmized a little the current non-Magic based MoveGenerator.
On my Notebook I got the next results for the 3 welknown positions.
1. Perft(11), 15.19s, 15.27s, 15.20s
2. Perft(9), 8.51s, 8.46s, 8.45s
3. Perft(15), 4.57s, 4.53s, 4.57s
Think especially the Perft results for 2. and 3. are fast, and Im curious what they would do on my own (or a similar machine).
I will do some more tests, to see if I can optimize further.
Next to that I will make a small Perft tester, so completely independant on my program (as the Perft now is embedded in my program, which could effect timing), to avoid any disturbances.
I will put the executable and source code on my Dropbox (think next weekend), so others (and you), can test.
Im also interested what the normal Visual Studio compiling will do (and optimization), and what the (positive) effect is of the Intel compiler.
Last but not least , hope/assume that Rein has some suggestions for my code, as he is the Master of programming here....
Bert
Re: Perft
Here already the source code for my (White) King Move based upon Magic.
WhiteKingMove()
Magic()
WhiteKingMove()
Code: Select all
int CMoveGenerate128::WhiteKingMoveM128(PTSTACK pTDepth)
{
DWORD dKingPosition;
BITBOARD *pbbField;
pbbField = pTDepth->pbbField;
PLOOK rlsp = pTDepth->NextMove;
BITBOARD bbWhitePiece = pbbField[BB_WHITEPIECE];
BITBOARD bbBlackPiece = pbbField[BB_BLACKPIECE];
BITBOARD bbKing = pbbField[BB_KING];
BITBOARD bbWhiteKing = bbWhitePiece & bbKing;
while (bbWhiteKing) {
BITSCANFORWARD64(&dKingPosition, bbWhiteKing);
BITBOARD bbKingPosition = (BITBOARD)1<<dKingPosition;
bbWhiteKing ^= bbKingPosition;
BITBOARD bbMagicMove = Magic(dKingPosition, bbWhitePiece | bbBlackPiece) & pbbField[BB_EMPTY];
while (bbMagicMove) {
BITBOARD bbMove = bbMagicMove & (~bbMagicMove + 1);
bbMagicMove ^= bbMove;
// Copy Position BitBoards
rlsp->bbField[BB_WHITEPIECE] = bbWhitePiece ^ (bbKingPosition ^ bbMove);
rlsp->bbField[BB_BLACKPIECE] = bbBlackPiece;
rlsp->bbField[BB_KING] = bbKing ^ (bbKingPosition ^ bbMove);
++rlsp;
++pTDepth->iXMoves;
}
}
pTDepth->NextMove = rlsp;
return (pTDepth->iXMoves);
}
Code: Select all
BITBOARD CMoveGenerate128::Magic(DWORD dKingPosition, BITBOARD bbOccupied)
{
BITBOARD bbMagicMask = MagicA[dKingPosition].bbMask & bbOccupied;
uint64 uiMagic = MagicA[dKingPosition].uiMagic;
int iShift = 64 - MagicA[dKingPosition].iShift;
int iIndex = (int)(uint64(bbMagicMask * uiMagic) >> iShift);
return (pBBMagic[MagicA[dKingPosition].iOffset + iIndex]);
}
-
- Posts: 471
- Joined: Wed May 04, 2016 11:45
- Real name: Joost Buijs
Re: Perft
It is possible that using magics for king-move generation doesn't help much, I have no means of comparing it with the tradition method of shifts and masks because I never programmed this.BertTuyt wrote: Yesterday ( = Saturday) I tried to debug the Magic code which I had started the weeks before.
It also now works (as the Perft numbers are the same), but for an unknown reason it is not faster.
I only use Magic for the King Moves, so not the King Capture detection (or alike).
Intuitively I have the feeling it should help but it depends on many other things.
For king-captures however I'm sure it works, you only need a few operations to determine whether a capture is possible in a certain direction or not and you only have to scan the empty squares in that direction when there is no capture possible.
Especially during endgames with a lot of empty squares and a few pieces scattered around it gives an advantage.
The Intel compiler generates roughly 7 to 10% faster code, it can optimize for specific architectures (in our case Haswell) and it has better PGO and vectorization.BertTuyt wrote: Im also interested what the normal Visual Studio compiling will do (and optimization), and what the (positive) effect is of the Intel compiler.
Bert
Recently Microsoft improved the optimization of MSVC as well: https://blogs.msdn.microsoft.com/vcblog ... optimizer/
A few weeks ago I installed VS2015 update 3 but I could not find much improvement, it is possible they didn't incorporate the new optimizer yet.
To determine king captures I basically do the following: ((magic_mask_for_current_direction & opposite_pieces & ~edge_squares) >> current_direction) & empty_squares;
This gives you the first empty square after the opposite piece you are allowed to capture.
It is possible to generate another set of magic_masks to avoid masking the edge-squares but that is bean counting, you only have to mask the opposite_pieces and edge_squares once for all directions.
Joost
Edit:
By looking at my code I see that I do it the other way around, I shift the empty squares in the negative direction to get the captured piece.
Since I have a different set of masks for the king-moves anyway I tried to remove the edge_squares from the magic_masks used for king-captures directly when building the table, somehow this does not work and I don't understand why, this is really something I have to look at.
Edit:
I found the culprit and it can only be solved by doubling table-size, I don't know what is worse, one extra instruction or a bigger burden upon the cache by a doubled table-size.
The next two weeks I will leave it alone, after this I will start working on my search and DXP otherwise my program will never play a single game.
-
- Posts: 1722
- Joined: Wed Apr 14, 2004 16:04
- Contact:
Re: Perft
Why do you need the edge_squares masking? If you have the "ghost" squares at each edge, then the opposite_pieces & (empty_squares << current_direction) would be the cheapest way to isolate all opposite pieces with an empty piece behind them.Joost Buijs wrote: To determine king captures I basically do the following: ((magic_mask_for_current_direction & opposite_pieces & ~edge_squares) >> current_direction) & empty_squares;
This gives you the first empty square after the opposite piece you are allowed to capture.
Re: Perft
Here the link to my Dropbox folder containing a Visual Studio 2015 Perft project (and Perft executable).
Including the Movegen source code.
Interesting what the Perft results are on a fast 4 Ghz machine and/or with better compiler optimization and/or Intel compiler.
Think it still could be faster.
Note: This version without the (in my case) Magic.
Hope that some have suggestions for better coding (Rein ) and/or optimization....
Bert
https://www.dropbox.com/sh/lesd96lzimh4 ... iTk_a?dl=0
Including the Movegen source code.
Interesting what the Perft results are on a fast 4 Ghz machine and/or with better compiler optimization and/or Intel compiler.
Think it still could be faster.
Note: This version without the (in my case) Magic.
Hope that some have suggestions for better coding (Rein ) and/or optimization....
Bert
https://www.dropbox.com/sh/lesd96lzimh4 ... iTk_a?dl=0
-
- Posts: 471
- Joined: Wed May 04, 2016 11:45
- Real name: Joost Buijs
Re: Perft
To be honest, I never looked at it in very great detail, I use ghost squares at 10,21,32, and 43 and I assumed that when there sits an enemy piece on the edge this test can fail that's why I mask out the edge-squares.Rein Halbersma wrote:Why do you need the edge_squares masking? If you have the "ghost" squares at each edge, then the opposite_pieces & (empty_squares << current_direction) would be the cheapest way to isolate all opposite pieces with an empty piece behind them.Joost Buijs wrote: To determine king captures I basically do the following: ((magic_mask_for_current_direction & opposite_pieces & ~edge_squares) >> current_direction) & empty_squares;
This gives you the first empty square after the opposite piece you are allowed to capture.
The funny thing is that the test fails indeed when I remove that mask and I still believe it is needed, but I will take another look at it.
My empty-square mask is the inverse of the occupied square-mask and it is possible that I should mark the ghost bits as being not empty, that will take time as well.
It is not so important anyway, with or without that edge-mask there won't be much difference in speed.
Last edited by Joost Buijs on Mon Aug 15, 2016 08:10, edited 2 times in total.
-
- Posts: 471
- Joined: Wed May 04, 2016 11:45
- Real name: Joost Buijs
Re: Perft
Bert,BertTuyt wrote:Here the link to my Dropbox folder containing a Visual Studio 2015 Perft project (and Perft executable).
Including the Movegen source code.
Interesting what the Perft results are on a fast 4 Ghz machine and/or with better compiler optimization and/or Intel compiler.
Think it still could be faster.
Note: This version without the (in my case) Magic.
Hope that some have suggestions for better coding (Rein ) and/or optimization....
Bert
https://www.dropbox.com/sh/lesd96lzimh4 ... iTk_a?dl=0
When I have some time this evening I will try to compile your code with the Intel compiler with PGO to see what it does on my machine which runs default at 3.6 GHz.
Because I don't want to change my BIOS settings (and I can't at the moment) you have to subtract ~11% from the times I measure to get comparable results.
Joost