Damage 15.3
Re: Damage 15.3
Krzysztof,
I uploaded a new version (damageengine153.exe), available via the download link (the date should be today, 5:06 PM). Could you check how it behaves?
I removed the large-pages support, as it did not work in my case.
As I did not optimize the engine for AMD, it still uses PEXT, so the speed might be lower than on my computer.
Could you do 2 tests and send me screenshots of the Damage engine output window?
Test 1. 1 thread, depth = 26, tt-size = 25
Test 2. 16 threads, depth = 30, tt-size = 25
You could do the tests without the Damage GUI.
Herewith some (secret) commands:
$ppproc 1
level depth=26
go
Make a screenshot and restart the program, then:
$ppproc 16
level depth=30
go
The $ppproc command sets the number of cores, but you could also do this via the .ini file.
Bert
-
- Posts: 1368
- Joined: Thu Jun 20, 2013 17:16
- Real name: Krzysztof Grzelak
Re: Damage 15.3
Here you go. I just don't know whether this is what you meant.
- Attachments
-
- 1.jpg (174.39 KiB) Viewed 12176 times
-
- 2.jpg (172.51 KiB) Viewed 12176 times
Re: Damage 15.3
Krzysztof, thanks.
I meant for these tests to be run from the initial position (so White to move).
But I could at least reproduce the 1-core result.
See the screenshot below from the Damage GUI.
All numbers are the same; the only (huge) difference is the search speed.
In your case 6.99 MN/s, in my case 17.86 MN/s.
I have no idea where this huge difference comes from.
For your information, NMED stands for the number of Nodes, Move generator, Evaluation and Database calls.
Part of it could be related to the slower AMD implementation of BMI2 instructions (such as PEXT), but I did not expect such a huge difference.
Did you run something in parallel, and what is the clock speed of your processor?
I will think about another test to find out...
Bert
-
- Posts: 1368
- Joined: Thu Jun 20, 2013 17:16
- Real name: Krzysztof Grzelak
Re: Damage 15.3
I don't know. In Kingsrow the speed is about 16,000 kN/s; in Damage it is about 7.03 MN/s. The processor is an AMD Ryzen Threadripper 1950X, 3.4 GHz, 32 MB.
-
- Posts: 1368
- Joined: Thu Jun 20, 2013 17:16
- Real name: Krzysztof Grzelak
Re: Damage 15.3
I understand that the search speed on 1 core is weak, Bert.
-
- Posts: 1368
- Joined: Thu Jun 20, 2013 17:16
- Real name: Krzysztof Grzelak
Re: Damage 15.3
I have one more question, Bert. There is a Book folder in the program folder. I understand that this is not a book for Damage 15.3. What happens if I type option book = 1? Will the program use this book during the game?
Krzysztof.
Re: Damage 15.3
The program will most likely crash or play nonsense moves.
I left the file in by accident.
My first priority was to share something with the community that worked (so GUI and engine).
Which seems to be the case.
The next priority is to test the improvements I have on my list, so that the strength gets closer to Scan/Kingsrow.
As Ed informed me, in a 3-move ballot match at 1 min for 75 moves per game, on 1 core and a moderate processor (i7-6500 at 2.4 GHz), Damage was around 51.9 Elo weaker than Kingsrow.
I did the same test against Scan 3.1 with a 6p DB and 1 core (but my processor runs at 4.3 GHz), and measured a difference of 29 Elo (988 games: 90 lost, 9 won, 880 drawn, and 9 unknown).
My own initial test showed a 10-20 Elo difference (against Scan and Kingsrow), but that was with 6 cores (all running at 4.3 GHz), 2 min for 65 moves per game, a fully loaded 7p DB, and a 2-move ballot match of 158 games.
So I will use the 3-move ballot match with 1 min for 65 moves, 1 core, and a 6p DB to test improvements, as these settings are much more sensitive for measuring differences.
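The Elo figures quoted above follow from the match score. A minimal sketch, assuming the standard logistic Elo formula and assuming the 9 unknown games are simply left out (neither is confirmed in this thread); elo_difference is a hypothetical helper name:

```python
import math


def elo_difference(wins, draws, losses):
    """Estimate the Elo difference from a match result (logistic model).

    Assumption: games with an unknown result are excluded by the caller.
    A negative value means the side whose wins are counted is weaker.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is undefined for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Damage versus Scan 3.1 as quoted: 9 won, 880 drawn, 90 lost.
print(round(elo_difference(9, 880, 90), 1))
```

With the quoted result this gives a little under -29, in line with the 29 Elo difference mentioned for the 988-game match.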
I will also find out whether I can improve the speed on AMD (so replace PEXT).
I might already post an improved version within 1 week (if all works).
Next I will work on reported bugs, such as the PDN bug in the GUI.
After this I will work on a new book file.
I already have the routines, so it is a matter of running my computer continuously for a week...
In the meantime I am awaiting some test results from Maximus and Sjende Blyn...
Bert
-
- Posts: 1368
- Joined: Thu Jun 20, 2013 17:16
- Real name: Krzysztof Grzelak
Re: Damage 15.3
I understand and thank you for the information.
-
- Posts: 299
- Joined: Tue Jul 07, 2015 07:48
- Real name: Fabien Letouzey
Re: Damage 15.3
Scan doesn't use PEXT. Not only is it crippled on AMD, as you just noticed, but it also won't scale in a world that will increasingly use ARM (which has bit reverse, BTW).
Here is an explanation that I sent to Rein a few years ago; hopefully you can make sense of it (fixed-width font mandatory):
---
It's easier with 4x4 patterns, as no special numbering is needed. It is about copying regions around by means of mask + shift.
This 4x8 region is already in place:
A B - - -
C D - - -
E F - - -
G H - - -
I J - - -
K L - - -
M N - - -
O P - - -
- - - - -
- - - - -
Each letter represents a bit; the rest is masked out. This region contains 3 4x4 patterns: A-H, E-L, and I-P.
Now this overlapping 4x8 one needs to be moved to the north-east (not an exact diagonal):
- - - - -
- - - - -
E F - - -
G H - - -
I J - - -
K L - - -
M N - - -
O P - - -
Q R - - -
S T - - -
The last pattern of the left 4 files appears: M-T.
By combination (OR), we now obtain this:
A B E F -
C D G H -
E F I J -
G H K L -
I J M N -
K L O P -
M N Q R -
O P S T -
- - - - -
- - - - -
The left 4 files are from the first region and the other 4, the second region. Now read the first two lines (consecutive bits). They contain the letters A-H in some permutation, plus a one-bit hole. This corresponds to one 4x4 pattern. The next two lines: E-L, another pattern. Same for I-P and M-T.
In other words, I exploit the regularity of 4x4 patterns.
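The 4x4 case above can be sketched in a few lines. This is an illustration only, assuming a simplified layout of 5 bits per diagram line with bit 0 at the top-left and the letters A-T in the two leftmost columns; Scan's real board numbering differs, and the names ROW, REGION_1, REGION_2, SHIFT and pattern_indices are hypothetical:

```python
ROW = 5  # bits per diagram line (4 used after the copy, 1 spare)


def bit(row, col):
    """Single-bit mask for a square in the assumed layout."""
    return 1 << (row * ROW + col)


# Columns 0-1 of rows 0-7 (letters A-P) and of rows 2-9 (letters E-T).
REGION_1 = sum(bit(r, c) for r in range(0, 8) for c in range(2))
REGION_2 = sum(bit(r, c) for r in range(2, 10) for c in range(2))

# Moving a region up 2 rows and right 2 columns is one right shift.
SHIFT = 2 * ROW - 2


def pattern_indices(board):
    """Return the four 4x4 pattern indices (A-H, E-L, I-P, M-T)."""
    combined = (board & REGION_1) | ((board & REGION_2) >> SHIFT)
    # Each pattern occupies two consecutive lines; 9 bits cover the
    # 8 letters plus the one-bit hole at the end of the first line.
    return [(combined >> (2 * ROW * k)) & 0x1FF for k in range(4)]


# Square E (row 2, column 0) belongs to patterns A-H and E-L.
print(pattern_indices(bit(2, 0)))
```

Two masks, one shift and one OR produce all four indices at once, which is the regularity the explanation relies on.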
There is an extra bit on every line; I have two things to say about this:
- it brings some hole in the index; that is acceptable
- we can see that there's not enough room for a third copy
The last point is why I need an extra bit on every rank for 4x6 patterns. With the 13x10 board, the left 4 files are:
A B - - - -
C D - - - - -
E F - - - -
G H - - - - -
I J - - - -
K L - - - - -
M N - - - -
O P - - - - -
Q R - - - -
S T - - - -
Only 3 patterns this time: A-L, E-P (optional), and I-T. If I make two copies with the same general idea I obtain this:
A B E F I J
C D G H K L -
E F I J M N
G H K L O P -
I J M N Q R
K L O P S T -
M N Q R - -
O P S T - - -
Q R - - - -
S T - - - -
Now read the first 3 groups of 2 lines each: A-L, E-P, and I-T. No hole, which is pretty much mandatory for large patterns.
I think this method is more or less isomorphic to magic multiplication with very few set bits. Multiplication seems overkill for only 2 or 3 bits, and I'm not even sure it is faster. Also I choose right shifts, which cannot be accomplished by that method. The idea also works with left shifts and therefore multiplication; I have checked.
Fabien.
Re: Damage 15.3
Krzysztof, I want to get an idea of your Threadripper's speed, and of whether PEXT is the main reason for the lower nodes/second.
Could you type the command #perft 11 in the Damage engine to start a perft?
You should get the following output.
As perft (the move generator) does not use PEXT, it might give some clues about the actual speed of your processor.
Thanks for your support,
Bert
-
- Posts: 1368
- Joined: Thu Jun 20, 2013 17:16
- Real name: Krzysztof Grzelak
Re: Damage 15.3
Here you go, Bert. I'm not sure if that's what you meant.
- Attachments
-
- Damage.jpg (154.94 KiB) Viewed 11460 times
Re: Damage 15.3
Krzysztof, thanks.
Were these results from your Threadripper? I ask because the timing difference for perft is smaller than I would expect based on the clock frequency (3.4 GHz versus 4.3 GHz).
It could be that the test ran in turbo boost, so the actual frequency was somewhat higher.
Anyway, if it was the Threadripper, then it seems that PEXT is really a speed killer on AMD.
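For readers unfamiliar with the instruction: PEXT (parallel bit extract, part of BMI2) gathers the bits of a source word selected by a mask and packs them into the low bits of the result. A minimal Python sketch of that behaviour, for illustration only and not taken from Damage; the function name pext is just a label here:

```python
def pext(src, mask):
    """Software model of PEXT: pack the bits of src selected by mask.

    Selected bits are taken from lowest to highest and written to
    consecutive low bits of the result.
    """
    result = 0
    out = 0  # position of the next result bit
    while mask:
        lowest = mask & -mask  # isolate the lowest set bit of the mask
        if src & lowest:
            result |= 1 << out
        out += 1
        mask &= mask - 1  # clear that mask bit
    return result


# The upper four bits of 0b10110010 are extracted and packed.
print(bin(pext(0b10110010, 0b11110000)))
```

A loop like this is far slower than the hardware instruction on processors where PEXT is fast, which is why a replacement would more likely use a handful of mask, shift and OR steps as in Fabien's explanation above.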
Bert
-
- Posts: 1368
- Joined: Thu Jun 20, 2013 17:16
- Real name: Krzysztof Grzelak
Re: Damage 15.3
I think you're right. I think PEXT and BMI2 are much faster on Intel. As for multithreading, I think (maybe I'm wrong) that the advantage is with AMD. Of course, it still depends on which processors we are talking about. Another matter, Bert, is that the price of the processor is crucial when buying.
Re: Damage 15.3
Do I understand correctly that Damage 15.3 is based on some kind of machine learning, but not on a neural-network approach like the one used by Lc0 in chess?
Re: Damage 15.3
Damage 15.3 is based on the ML principles first applied in draughts by Fabien in Scan.
Later, other programs such as Kingsrow and Maximus also implemented this method.
Bert