NNUE
Re: NNUE
Krzysztof,
I will continue to work on Damage.
The recent activity with Scan was to create a proof of concept for the programmers' community.
This year I will work on Damage 16.x; the main changes are Lazy SMP (instead of YBWC) for the search and NNUE for the evaluation function.
If you organize a tournament later this year (August - September) I expect the new version of Damage to be available, and I will share it with you and all who are interested.
Bert
Re: NNUE
Yves, thanks for your post.
Also thanks for your nice words, but again, I'm not alone; we should thank all the active contributors in this forum and the active members of the computer draughts community, as we build on the shoulders of giants.
The NNUE concept (Efficiently Updatable Neural Network, abbreviated in reverse) is not totally new: it was first developed for Shogi and later ported to chess, with Stockfish as the best-known example.
Most (maybe all) strong chess programs now use NNUE, as Joost could confirm, I guess.
And as the neural networks, and the toolbox for developing them, are quite generic, we can learn a lot from the chess forum (talkchess).
I think, as Rein once posted, that sharing information is the only way we can make progress together.
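To illustrate the "efficiently updatable" part: the first network layer is a sum of per-feature weight columns, so a quiet move only needs one subtraction and one addition per hidden unit instead of a full recompute. A minimal sketch; the sizes, the (color, square) feature encoding and all names are made up for illustration, not Scan's or Damage's actual network:
Code: Select all
// Sketch of the "efficiently updatable" first layer of an NNUE eval.
#include <array>
#include <cstdint>

constexpr int NUM_FEATURES = 2 * 50;   // one feature per (color, square)
constexpr int HIDDEN       = 256;      // first-layer width

// First-layer weights; in a real engine these come from the network file.
std::array<std::array<int16_t, HIDDEN>, NUM_FEATURES> W1;

struct Accumulator {
    std::array<int32_t, HIDDEN> sum{}; // cached first-layer sums

    // Full recompute: only needed at the root of the search.
    void refresh(const int* active, int n) {
        sum.fill(0);
        for (int i = 0; i < n; ++i)
            for (int h = 0; h < HIDDEN; ++h)
                sum[h] += W1[active[i]][h];
    }

    // Incremental update for a quiet move: one feature switches off, one
    // switches on. Cost is O(HIDDEN), independent of the board contents.
    void move_piece(int from_feature, int to_feature) {
        for (int h = 0; h < HIDDEN; ++h)
            sum[h] += W1[to_feature][h] - W1[from_feature][h];
    }
};

int main() {
    Accumulator acc;
    const int features[] = {0, 3, 7};  // hypothetical: three white men
    acc.refresh(features, 3);
    acc.move_piece(3, 8);              // the man on square 3 moves to 8
}
During search, make/unmake would call move_piece (plus similar add/remove updates for captures and promotions), so only the small later layers need recomputing per evaluation.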
A few answers: the file contains the weights for the neural network (NN).
I'm not sure whether you are fully familiar with the concept of an NN; otherwise we need to post some more background information.
The playing strength will not improve for a fixed combination of executable and NN, as there is no feedback mechanism, so the program won't learn while playing.
The same holds for a hand-crafted evaluation (for example Sjende Blyn) or a pattern-based evaluation (as applied by Scan, Kingsrow and Damage).
Last but not least, the pattern-based implementation is so good and efficient that I don't believe NNUE will surpass it on short notice, and if it does become better, it will only be by a small margin (but that's my gut feeling), and not today.
On the other hand, NNUE, and NNs in general, are such a hype that we will see huge advances in algorithms, toolboxes, hardware and software in the coming years.
Bert
Re: NNUE
From Ed I got the information that the current Scan nnue implementation does not work in Hub mode.
I checked that, and he is (as usual) right.
As I mainly play DXP matches with Scan nnue, I only included the DXP option.
So I will fix this during the week and post an update.
Bert
Re: NNUE
BertTuyt wrote: ↑Mon Jan 18, 2021 12:40 From Ed I got the information that the current Scan nnue implementation does not work in Hub mode. [...]
Thanks again Bert.
I tested scan_31nnue, and it plays better.
Thanks
Sidiki
Ed Gilbert
Re: NNUE
Match results, Scan_31nnue vs Scan_31, played on a Xeon W-3245, 2-move start positions, TC 75 moves in 1 minute, books off, 6-piece dbs, 1 search thread.
[ 1]: 0.465 score, 632 games, 0 wins, 44 losses, 587 draws, 1 unk
[ 2]: 0.467 score, 632 games, 1 wins, 42 losses, 586 draws, 3 unk
[ 3]: 0.463 score, 632 games, 2 wins, 49 losses, 580 draws, 1 unk
[ 4]: 0.464 score, 632 games, 0 wins, 45 losses, 586 draws, 1 unk
[ 5]: 0.464 score, 632 games, 0 wins, 45 losses, 587 draws, 0 unk
[ 6]: 0.470 score, 632 games, 0 wins, 38 losses, 594 draws, 0 unk
[ 7]: 0.457 score, 632 games, 2 wins, 56 losses, 572 draws, 2 unk
[ 8]: 0.456 score, 632 games, 2 wins, 57 losses, 571 draws, 2 unk
[ 9]: 0.466 score, 632 games, 1 wins, 44 losses, 587 draws, 0 unk
[10]: 0.466 score, 632 games, 1 wins, 44 losses, 586 draws, 1 unk
[11]: 0.460 score, 632 games, 4 wins, 54 losses, 573 draws, 1 unk
[12]: 0.468 score, 632 games, 0 wins, 40 losses, 592 draws, 0 unk
[13]: 0.468 score, 632 games, 1 wins, 42 losses, 589 draws, 0 unk
[14]: 0.479 score, 632 games, 3 wins, 30 losses, 598 draws, 1 unk
[15]: 0.461 score, 632 games, 0 wins, 49 losses, 583 draws, 0 unk
[16]: 0.475 score, 632 games, 0 wins, 32 losses, 599 draws, 1 unk
total 0.466 score, 10112 games, 17 wins, 711 losses, 9370 draws, 14 unk
elo diff -23.9
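For readers wondering how the score maps to -23.9: it is the standard logistic Elo conversion. A small self-contained check (not Ed's actual tool; the 14 unknown results are simply left out):
Code: Select all
// Check of the reported Elo difference via the standard logistic model.
#include <cmath>
#include <cstdio>

double elo_diff(double score) {
    return -400.0 * std::log10(1.0 / score - 1.0);
}

int main() {
    // Totals from the match: 17 wins, 9370 draws, 711 losses.
    double score = (17 + 0.5 * 9370) / (17 + 9370 + 711);
    std::printf("score %.3f -> Elo %+.1f\n", score, elo_diff(score));
    // Prints: score 0.466 -> Elo -23.9
}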
Re: NNUE
Ed, thanks for your test; now we have an accurate baseline to start from.
I think this is already an encouraging result, as we are only at the beginning of NNUE.
With this Elo, I assume the NNUE version would already beat many other programs.
What's your two cents?
Bert
Re: NNUE
Ed, another thing.
During my test (also multi-core) the frequency was around 4.3 GHz.
As my results were slightly better, I assume that your processor was running at a somewhat lower frequency.
But at least your results are backed up by better statistics.
Bert
Ed Gilbert
Re: NNUE
BertTuyt wrote: ↑Mon Jan 18, 2021 17:51 [...] With this Elo, I assume the NNUE version would already beat many other programs. What's your two cents?
The W-3245 CPU is 3.2 GHz, 16 cores.
Yes, certainly it is stronger than many programs that Krzysztof runs in his tournaments. Of course a big part of that is that it has the Scan search, which is very good, perhaps the best of all draughts programs. As to whether it can eventually do as well as patterns, that's difficult to say. A big unknown in this test is the difference between the training positions you used to generate the nnue weights and the training positions that Fabien used for the pattern weights. We know that the quality of the training positions has a big effect on the quality of the weights, and I think some of us are still trying to understand what makes a good set of training positions. In any case this should be a handy platform for experimenting.
Krzysztof Grzelak
Re: NNUE
Ed, which do you think is the stronger: Scan 3.1nnue or Scan 3.1?
Re: NNUE
Krzysztof,
the test from Ed, and also my own test, made clear that Scan is still better than Scan nnue.
As we are only starting to dig into NNUE, there is room for improvement.
I personally don't believe we will surpass Scan (with all other boundary conditions the same).
But at what speed we will improve this result, we will see in the months ahead.
Bert
Krzysztof Grzelak
Re: NNUE
I understand, and thank you for your answer, Bert.
Rein Halbersma
Re: NNUE
Ed Gilbert wrote: ↑Mon Jan 18, 2021 18:45 [...] In any case this should be a handy platform for experimenting.
I think it would be a good idea to incorporate self-play in a reinforcement learning loop. AFAICS, most current draughts programs generate games from a material-only or material + piece-square-table engine and then train a pattern or NNUE eval on that single (possibly very big!) set of training games. It's not much more complicated to write an RL loop around it: generate training games, train the weights, make a new engine, generate more training games, etc. Also, while training it would be a good idea to have a tunable "exploration parameter". That could add random noise to your eval, or randomly select an arbitrary move for some small percentage of the positions (tunable by some "epsilon" parameter); see the sketch after this post.
During the Xmas break, I started working through this book on RL: http://incompleteideas.net/book/RLbook2020.pdf (the first author was David Silver's PhD advisor). In Chapter 16 there is also a nice discussion of game playing programs (backgammon, AlphaGo and even Samuel's checkers program!). The key ideas are again an endless loop that alternates between generating new data and improving the current model with that data, as well as continual exploration during data generation. Chapter 1 starts with a very simple Tic-Tac-Toe example to illustrate these key ideas in self-play learning. I'm gonna change that TTT into a 4x4 or 6x6 draughts example, probably in Python or maybe in C++ using my DCTL library (not sure yet).
One other key problem in RL is how to do "credit assignment". E.g. in a game, the results only come at the end. Most methods just assign the game result to all prior positions, even though the game was only decided by a mistake at the very end. There is also "temporal difference" learning, where each position influences the current weight update (as with gradient descent) proportional to the difference between its own eval and the max eval of its successors. So equal opening positions don't influence the eval weights even if the game is lost in the endgame. In chess, people are now also experimenting with using a combination of the game results and intermediate eval scores as training targets.
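To tie these ingredients together (the generate-and-train loop, the "epsilon" exploration parameter, and temporal-difference credit assignment), here is a self-contained sketch of the Tic-Tac-Toe learner Rein mentions, as TD(0) self-play with epsilon-greedy move selection. All constants and names are illustrative, and the book's version is slightly more careful about updates after exploratory moves:
Code: Select all
// TD(0) value learning for Tic-Tac-Toe with epsilon-greedy self-play.
#include <array>
#include <cstdio>
#include <random>
#include <unordered_map>

using Board = std::array<int, 9>;              // 0 empty, +1 cross, -1 nought

int winner(const Board& b) {                   // +1/-1 if someone won, else 0
    static const int lines[8][3] = {{0,1,2},{3,4,5},{6,7,8},{0,3,6},
                                    {1,4,7},{2,5,8},{0,4,8},{2,4,6}};
    for (auto& l : lines)
        if (b[l[0]] != 0 && b[l[0]] == b[l[1]] && b[l[1]] == b[l[2]])
            return b[l[0]];
    return 0;
}

long key(const Board& b) {                     // encode a board in base 3
    long k = 0;
    for (int s : b) k = 3 * k + (s + 1);
    return k;
}

int main() {
    std::unordered_map<long, double> V;        // state values, cross's view
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> uni(0.0, 1.0);
    const double alpha = 0.1, epsilon = 0.1;   // learning / exploration rates

    for (int game = 0; game < 200000; ++game) {   // generate-and-train loop
        Board b{};
        int side = 1;
        long prev = key(b);
        while (true) {
            std::array<int, 9> moves;
            int n = 0;
            for (int s = 0; s < 9; ++s)
                if (b[s] == 0) moves[n++] = s;

            // Epsilon-greedy: usually pick the move whose successor has the
            // best learned value, sometimes a random one to keep exploring.
            int choice = moves[rng() % n];
            if (uni(rng) >= epsilon) {
                double best = side > 0 ? -2.0 : 2.0;
                for (int i = 0; i < n; ++i) {
                    Board nb = b;
                    nb[moves[i]] = side;
                    long k = key(nb);
                    double v = V.count(k) ? V[k] : 0.5;  // neutral default
                    if ((side > 0 && v > best) || (side < 0 && v < best)) {
                        best = v;
                        choice = moves[i];
                    }
                }
            }
            b[choice] = side;
            side = -side;

            // TD(0) credit assignment: nudge the previous state's value
            // toward the successor's value (or the final result), instead of
            // assigning the end result to every position of the game.
            int w = winner(b);
            bool over = (w != 0) || (n == 1);  // n == 1: board is now full
            long cur = key(b);
            double target = over ? (w + 1) / 2.0
                                 : (V.count(cur) ? V[cur] : 0.5);
            if (!V.count(prev)) V[prev] = 0.5;
            V[prev] += alpha * (target - V[prev]);
            prev = cur;
            if (over) break;
        }
    }
    std::printf("learned values for %zu states\n", V.size());
}
Scaling the same loop from Tic-Tac-Toe to 4x4 or 6x6 draughts would mainly mean swapping in a draughts move generator and replacing the value table with a trainable eval.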
Rein Halbersma
Re: NNUE
One "worrying" thing about Scan-like patterns is that after ~10-100M positions, their expressive power seems completely exhausted in the sense that the model doesn't improve its goodness of fit anymore when adding new games. My TensorFlow progress curves all flatline no matter what I try. I have found it impossible to overfit on Ed's data. That seems to imply that you can't better results with those type of evals anymore unless you add more patterns. OTOH, more complex evals at least have the potential to potentially be better. Whether that complexity is affordable during search is another matter.BertTuyt wrote: ↑Mon Jan 18, 2021 20:17Krzysztof,
the test from Ed, and also my test made clear that Scan is still better than Scan nnue.
As we only start to digging into nnue, there is room for improvement.
I personally don't believe we will surpass (with all other boundary conditions the same) Scan.
But at which speed we will improve this result, we will see in the next months ahead.
Bert
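To make "expressive power" concrete: a Scan-like eval is linear in its pattern weights, i.e. a sum of table lookups, one per board region. A sketch with illustrative region count, shape and encoding, not Scan's actual layout:
Code: Select all
// Sketch of a Scan-like pattern evaluation: a sum of table lookups.
#include <array>
#include <cstdint>
#include <cstdio>

constexpr int NUM_PATTERNS    = 16;    // hypothetical number of regions
constexpr int SQUARES_PER_PAT = 8;     // squares per region
constexpr int TABLE_SIZE      = 6561;  // 3^8 states per region

// One fitted weight table per region; these are all the model's parameters.
std::array<std::array<int16_t, TABLE_SIZE>, NUM_PATTERNS> weights;

// squares[p][i]: contents (0 empty, 1 white man, 2 black man) of
// square i of region p, extracted from the current position.
using Regions = std::array<std::array<int, SQUARES_PER_PAT>, NUM_PATTERNS>;

int evaluate(const Regions& squares) {
    int score = 0;
    for (int p = 0; p < NUM_PATTERNS; ++p) {
        int index = 0;
        for (int i = 0; i < SQUARES_PER_PAT; ++i)
            index = 3 * index + squares[p][i];  // base-3 index of the region
        score += weights[p][index];             // linear model: pure lookup
    }
    return score;
}

int main() {
    Regions empty{};                            // an empty board
    std::printf("eval = %d\n", evaluate(empty));
}
With these illustrative sizes the model has only 16 x 6561, roughly 10^5, parameters and is linear in them, so once they are fitted there is nothing left for extra games to improve, consistent with the flatlining curves Rein describes.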
Re: NNUE
Rein, interesting posts.
I tend to agree that in the end an NN or NNUE is a better abstraction for an evaluation function.
However, current hardware (although improving) imposes too big an NPS penalty for NNUE, which still favors the Scan-like patterns.
I'm convinced, however, that the balance will change in the coming years, especially with progress in self-learning frameworks as you described in your first post, and with next-generation processors that will contain standardized NN engines.
I get a déjà vu feeling: I was already thinking about bitboards in the '70s, when they were pioneered (to my knowledge) by the chess program Kaissa. But the power of 8-bit processors at that time still favored the traditional mailbox approach to board representation.
So if you want to write the best draughts program in the world, I would propose sticking to Scan patterns.
However, if you want to prepare for the future and co-write history, embark on the NN and NNUE train.
The good news: the train has already left the station.
Bert