Damage 15.3
Re: Damage 15.3
The 4 hours of training for AlphaZero corresponds to about 1700 years of computing on a consumer-grade PC (an estimate by the author of Leela-Zero, who managed to gather this much compute by crowd-sourcing the computation).
Re: Damage 15.3
Rein Halbersma wrote: ↑Fri Feb 21, 2020 16:32
The 4 hours of training for AlphaZero corresponds to about 1700 years of computing on a consumer-grade PC (an estimate by the author of Leela-Zero, who managed to gather this much compute by crowd-sourcing the computation).

Awesome, I understand why A0 is so strong. No way 😮
Re: Damage 15.3
Sidiki wrote: ↑Fri Feb 21, 2020 16:40
Awesome, I understand why A0 is so strong. No way 😮

Yes, Google has a lot of resources :)
Re: Damage 15.3
Yes, with resources the impossible becomes possible. So are these 1700 years of computing used as a kind of database, or just to find out which variant is better?
Re: Damage 15.3
I think > 90% of the computing time was used to generate self-play games, and the rest of the resources were used to find the optimal weights for the neural network.
Re: Damage 15.3
Rein Halbersma wrote: ↑Fri Feb 21, 2020 16:32
The 4 hours of training for AlphaZero corresponds to about 1700 years of computing on a consumer-grade PC (an estimate by the author of Leela-Zero, who managed to gather this much compute by crowd-sourcing the computation).

Such information should be treated humorously; do not believe everything.
Re: Damage 15.3
Rein Halbersma wrote: ↑Fri Feb 21, 2020 16:55
I think > 90% of the computing time was used to generate self-play games, and the rest of the resources were used to find the optimal weights for the neural network.

Nice, supercomputers can calculate awesome things. Imagine if the computation had run for a whole month.
The depth is also incredible. I also play chess, and sometimes A0 plays sacrifices more than 25 moves deep.
Re: Damage 15.3
Krzysztof Grzelak wrote: ↑Fri Feb 21, 2020 17:13
Such information should be treated humorously; do not believe everything.

:D
Re: Damage 15.3
Krzysztof Grzelak wrote: ↑Fri Feb 21, 2020 17:13
Such information should be treated humorously; do not believe everything.

You do understand that Google/DeepMind used 1700 years' worth of computing *in parallel* in order to achieve all that in 4 hours?
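For scale, rough arithmetic on the implied parallelism factor (a back-of-the-envelope estimate, not a figure from the paper):

```python
hours_on_one_pc = 1700 * 365 * 24   # ~14.9 million hours of single-PC compute
parallelism = hours_on_one_pc / 4   # squeezed into 4 wall-clock hours
print(f"{parallelism:,.0f}x")       # ~3,723,000x effective speedup
```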
Re: Damage 15.3
I wanted to know how the strength of the Damage engine scales with the number of learning games used to calculate the weights for the evaluation function.
For this purpose I started several 158-game DXP matches at 2 min/game. Further settings as mentioned in the first post.
For the training set I only used the first X (X = 10,000, 20,000, ...) games of the larger available set (1.37M).
Below are the results in table and graph format (graph: elo2.png).

Code: Select all

Training games    W    D    L    U     T    Match    ELO
        10000    79   76    0    3   158        0    195
        20000    50  108    0    0   158        1    114
        40000    28  128    0    2   158        2     63
        80000    12  146    0    0   158        3     26
       160000     5  151    0    2   158        4     11
       320000     8  148    0    2   158        5     18
       640000     2  156    0    0   158        6      4
      1280000     4  154    0    0   158        7      9

These results seem to indicate that, for this training set and this evaluation function, saturation starts around 160K games.
Hereafter the curve seems (within statistical fluctuations) more or less flat.
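As a sanity check, the ELO column matches the standard logistic Elo model when unfinished (U) games are scored as draws. A minimal sketch of that calculation (the scoring of U games is an assumption, as the exact method isn't stated):

```python
import math

def elo_diff(w, d, l, u):
    """Elo difference from match totals, scoring unfinished (u) games
    as draws. Reproduces most rows of the table above."""
    score = (w + 0.5 * (d + u)) / (w + d + l + u)
    return 400 * math.log10(score / (1 - score))

print(round(elo_diff(50, 108, 0, 0)))  # -> 114, the 20K-games row
print(round(elo_diff(5, 151, 0, 2)))   # -> 11, the 160K-games row
```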
To answer one of the previous questions: if one takes a state-of-the-art Threadripper with 32 cores, and assumes one game takes 6 seconds (based on 50 ms/move), then it takes around 8.5 hours to generate these 160K games.
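Spelled out (assuming roughly 120 moves per game at 50 ms/move):

```python
games = 160_000
seconds_per_game = 0.050 * 120      # 50 ms/move * ~120 moves ≈ 6 s/game
cores = 32

hours = games * seconds_per_game / cores / 3600
print(f"{hours:.1f} hours")         # ~8.3 hours, i.e. around 8.5 hours
```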
The graph also suggests that Damage is still 5-10 Elo weaker than Scan 3.1 (on my machine, with this time setting), although I expected a slightly better result based on the Damage win in the first match (2 wins, 1 loss, 155 draws).
For this reason I replayed the match with the evaluation based upon the full training set (1.37M games).
With 1 loss for Damage, 157 draws, and an Elo difference of 2, it more or less confirmed that there are still some steps for Damage to take, as Scan still seems to be the better engine.
Also interesting to see that after the initial 2 wins of Damage last weekend, Scan refuses to lose (maybe it learns in a secret way :D )
Bert
Re: Damage 15.3
BertTuyt wrote: ↑Fri Feb 21, 2020 19:33
Also interesting to see that after the initial 2 wins of Damage last weekend, Scan refuses to lose (maybe it learns in a secret way :D )

Your last sentence is funny; it seems that Scan effectively refuses to lose. :D
It has learning weights, no? So I want to know whether your training weights are similar to LC0's.
If that's the case, I think the 10,000 to 20,000 game sets are the better ones.
Sidiki
Re: Damage 15.3
Sidiki, LC0 is completely different and uses a deep neural network; it's nothing like what we do...
Bert
Re: Damage 15.3
OK, I thought it was the same thing with the weights. 👍 😎
Thanks.
Re: Damage 15.3
It's both very similar and very different. Both A0's neural networks and Scan-inspired patterns have their weights fitted using gradient-descent optimization. But the A0 neural networks are much more complicated non-linear functions, and computing the full eval is many orders of magnitude more expensive. This makes the self-play very slow. Pattern eval functions are very cheap to compute, and are a very effective compromise between a fully general, tunable eval and a hand-made, hand-tuned eval. It's possible that neural networks can be even stronger than patterns (Elo-wise), but unless someone sets up an AlphaZero type of infrastructure, that's hard to prove.
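To illustrate the shared idea, here is a minimal sketch of fitting pattern weights by gradient descent on game results (the feature indexing and data are hypothetical, not Scan's or Damage's actual code):

```python
import numpy as np

N_PATTERNS = 10_000            # hypothetical number of distinct patterns
weights = np.zeros(N_PATTERNS)
rng = np.random.default_rng(0)

def eval_position(active):
    """Linear pattern eval: the sum of the weights of the active patterns."""
    return weights[active].sum()

def sgd_step(active, result, lr=0.01):
    """One gradient-descent step on the logistic loss, nudging the active
    pattern weights toward the game result (1 = win, 0.5 = draw, 0 = loss)."""
    pred = 1.0 / (1.0 + np.exp(-eval_position(active)))
    # subtract.at handles repeated indices correctly, unlike fancy "-="
    np.subtract.at(weights, active, lr * (pred - result))

# Toy usage with random "positions"; real training iterates over millions
# of positions stored from self-play games.
for _ in range(1000):
    position = rng.integers(0, N_PATTERNS, size=8)
    sgd_step(position, result=rng.choice([0.0, 0.5, 1.0]))
```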
One other difference is that A0 learned from scratch via reinforcement learning on batches of self-play games. Between every batch of games, the eval weights are updated and a new self-play cycle starts, until a better version has been found. The pattern tuning for the draughts engines has been done after all self-play games have finished and their critical positions were stored. It's possible that reinforcement learning could further improve the current programs.
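The difference in training schedule can be summarized in pseudocode (self_play and update_weights are assumed helper routines, not real APIs):

```python
# A0-style reinforcement learning: weights are updated between batches,
# so each new batch of self-play games is generated by a stronger player.
def train_alphazero_style(weights, n_cycles, batch_size):
    for _ in range(n_cycles):
        games = self_play(weights, batch_size)    # uses the current weights
        weights = update_weights(weights, games)  # refit, then loop again
    return weights

# Draughts-engine style: all games are generated first with fixed weights,
# then the pattern weights are fitted once on the stored critical positions.
def train_after_self_play(weights, n_games):
    games = self_play(weights, n_games)
    return update_weights(weights, games)
```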
Re: Damage 15.3
Cool Rein,
So that's the big difference between A0 and our pattern programs.
If we had the same computing resources as Google, could we also reach this kind of super draughts program by self-play reinforcement learning?!
Thanks again.
We are waiting for this evening's Damage results.
Sidiki