NNUE

Discussion about development of draughts in the time of computer and Internet.
Post Reply
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: NNUE

Post by Rein Halbersma » Sat Jan 16, 2021 00:49

Joost Buijs wrote:
Fri Jan 15, 2021 23:29
Rein Halbersma wrote:
Fri Jan 15, 2021 22:00
Joost Buijs wrote:
Wed Jan 13, 2021 09:11
In practice 1.4B positions are difficult to handle (one needs vast amounts of memory for this), so I used only 240M.
I have gotten distracted with many other interesting machine learning projects (non-draughts) that kind of had to happen right now before anything else. But I will soon start working again on my Keras/Tensorflow pipeline for Scan-like pattern evals.

One improvement that is possible for any eval training pipeline (NNUE, patterns, whatever) using Keras/Tensorflow (or PyTorch for that matter) is to stream the data from disk to the optimizer. That way, you need much less RAM than the entire database. E.g. for Kingsrow, Ed supplied me with a ~10 GB file. Loading that into memory and feeding it to Keras expanded it temporarily to ~48 GB of RAM during optimization. When done from disk in batches, it should be configurable to get it below ~16 GB without much speed loss for *arbitrarily* large files on disk.
Maybe a coincidence, but I just told Bert this afternoon that I'm busy modifying my pyTorch DataLoader in such a way that it can read chunks of data from disk. On my AMD PC with 128 GB RAM I can load at most 625M positions in memory because each position is 192 bytes. There is a way to store positions as 192 bits and translate them on the fly to 192 bytes when needed for the training pipeline, but in Python this will be slow. Reading a chunk from disk seems easier, and with an SSD it will be fast enough.
In TF you can do pretty neat stuff, like directly streaming over a zipped archive of txt/csv files. You can also write a data preprocessing layer that would apply a function on eg a FEN or a set of bitboards. Then everything is part of the model pipeline, and it becomes much easier to experiment.

https://www.tensorflow.org/tutorials/load_data/csv
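The batched-from-disk idea can be sketched in plain Python (the 24-byte record size and the file layout here are invented for illustration, not Kingsrow's actual format):

```python
import os
import struct
import tempfile

RECORD_SIZE = 24  # hypothetical: 192 bits = 24 bytes per packed position

def stream_batches(path, batch_size):
    """Yield lists of raw fixed-size records without loading the whole file."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD_SIZE * batch_size)
            if not chunk:
                break
            # split the chunk into individual records
            yield [chunk[i:i + RECORD_SIZE]
                   for i in range(0, len(chunk), RECORD_SIZE)]

# tiny demo: write 10 dummy records, read them back in batches of 4
path = os.path.join(tempfile.mkdtemp(), "positions.bin")
with open(path, "wb") as f:
    for i in range(10):
        f.write(struct.pack("<Q", i) + bytes(RECORD_SIZE - 8))

batches = list(stream_batches(path, 4))
print([len(b) for b in batches])  # [4, 4, 2]
```

Peak memory is then bounded by the batch size rather than the file size, which is the same property tf.data (or a chunked PyTorch DataLoader) gives you.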

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: NNUE

Post by BertTuyt » Sat Jan 16, 2021 01:11

The test with multicore seemed to work.
The usual settings 1 min/65 moves (so equal time control), 6p DB, no book, and both 6 cores.

Result KingsRow - Scan nnue (perspective KR):
158 games, 6W, 0L, 152D, Elo difference is 13.

During the weekend I will try to make the Scan nnue source available for all.

Bert
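The Elo differences quoted in these matches follow from the standard logistic relation between match score and rating difference; a quick sketch (the formula is the usual one, not taken from Bert's post):

```python
import math

def elo_diff(wins, losses, draws):
    """Rating difference implied by a match score, via the logistic model."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # draws count as half a point
    return 400.0 * math.log10(score / (1.0 - score))

print(round(elo_diff(6, 0, 152)))  # Kingsrow's margin above: 13
```

The same function reproduces the Elo figures Bert reports for the later matches as well.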

Sidiki
Posts: 321
Joined: Thu Jan 15, 2015 16:28
Real name: Coulibaly Sidiki

Re: NNUE

Post by Sidiki » Sat Jan 16, 2021 05:23

BertTuyt wrote:
Sat Jan 16, 2021 01:11
The test with multicore seemed to work.
The usual settings 1 min/65 moves (so equal time control), 6p DB, no book, and both 6 cores.

Result KingsRow - Scan nnue (perspective KR):
158 games, 6W, 0L, 152D, Elo difference is 13.

During the weekend i will try to make the Scan nnue source available for all.

Bert
Thanks to all programmers, Bert, Ed, Fabien, Joost, Rein and the others for sharing your tests, engines and ideas that help us to increase our draughts levels and have fun by learning from all these engines.

Thanks again, God bless you all.

Sidiki.

Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: NNUE

Post by Joost Buijs » Sat Jan 16, 2021 09:11

Rein Halbersma wrote:
Sat Jan 16, 2021 00:49
Joost Buijs wrote:
Fri Jan 15, 2021 23:29
Rein Halbersma wrote:
Fri Jan 15, 2021 22:00

I have gotten distracted with many other interesting machine learning projects (non-draughts) that kind of had to happen right now before anything else. But I will soon start working again on my Keras/Tensorflow pipeline for Scan-like pattern evals.

One improvement that is possible for any eval training pipeline (NNUE, patterns, whatever) using Keras/Tensorflow (or PyTorch for that matter) is to stream the data from disk to the optimizer. That way, you need much less RAM than the entire database. E.g. for Kingsrow, Ed supplied me with a ~10 GB file. Loading that into memory and feeding it to Keras expanded it temporarily to ~48 GB of RAM during optimization. When done from disk in batches, it should be configurable to get it below ~16 GB without much speed loss for *arbitrarily* large files on disk.
Maybe a coincidence, but I just told Bert this afternoon that I'm busy modifying my pyTorch DataLoader in such a way that it can read chunks of data from disk. On my AMD PC with 128 GB RAM I can load at most 625M positions in memory because each position is 192 bytes. There is a way to store positions as 192 bits and translate them on the fly to 192 bytes when needed for the training pipeline, but in Python this will be slow. Reading a chunk from disk seems easier, and with an SSD it will be fast enough.
In TF you can do pretty neat stuff, like directly streaming over a zipped archive of txt/csv files. You can also write a data preprocessing layer that would apply a function on eg a FEN or a set of bitboards. Then everything is part of the model pipeline, and it becomes much easier to experiment.

https://www.tensorflow.org/tutorials/load_data/csv
They use pandas; this should work with pyTorch too. Currently I use numpy.fromfile to read binary files, which is as fast as reading binary files with C++. I prepare the binary data in C++; Python is way too slow for this. I'm thinking about using libTorch and doing everything in C++. The syntax is different though, so this would be another learning trajectory.
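The 192-bit packing Joost describes can be illustrated in plain Python (the layout below, three 64-bit piece masks per position, is a guess for illustration; in a real pipeline numpy.unpackbits would do this expansion far faster):

```python
import struct

def pack_position(masks):
    """Pack three hypothetical 64-bit piece masks into 24 bytes (192 bits)."""
    return struct.pack("<3Q", *masks)

def unpack_to_bytes(packed):
    """Expand 192 packed bits into 192 bytes of 0/1, one byte per feature."""
    masks = struct.unpack("<3Q", packed)
    return bytes((m >> i) & 1 for m in masks for i in range(64))

masks = (0b1011, 0, 1 << 63)
expanded = unpack_to_bytes(pack_position(masks))
print(len(expanded))  # 192 one-byte features from a 24-byte record
```

The 8x storage saving is exactly the 625M-vs-full-database gap mentioned above; the cost is the per-position unpacking step.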

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: NNUE

Post by BertTuyt » Sat Jan 16, 2021 09:39

I also did a DXP match where Scan nnue had twice the game time compared with KingsRow: 2 min/65 moves instead of 1 min/65 moves.
Other settings were similar: 6 cores, 6p DB, no book.

Match result (perspective Kingsrow): 3W, 155D, so Elo difference = 7.

Bert

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: NNUE

Post by BertTuyt » Sat Jan 16, 2021 16:41

Herewith the link to the dropbox folder with the sources of Scan nnue.
I will post some side notes later.

I would like to thank several people without whom this would not have been possible.

First of all Jonathan, for starting to implement nnue in his checkers program; I used his code as a base, and you can still find some original traces.
Fabien for his work on Scan, and sharing all sources.
Ed for providing the benchmark program KingsRow free of charge.
And finally Joost, with whom I exchanged many nnue ideas and thoughts on a daily basis.

I really hope that more people will embark on this nnue adventure and share findings, thoughts and results in this forum.

Bert

https://www.dropbox.com/sh/estknh6oqq8i ... -iw-a?dl=0

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: NNUE

Post by BertTuyt » Sat Jan 16, 2021 16:50

I also checked the most recent version of Scan nnue.

The match against Kingsrow was with the traditional settings, so 1 min/65 moves, 6p DB, 6 cores, no book.

Result (perspective KR): 3W, 0L, 151D, 4U

I analyzed the 4 unknown results (4U), with outcomes: game 53 draw, game 133 a win for Scan nnue, game 141 draw and game 147 also a draw.

So the end result: 3W, 1L, 154D, Elo = 4.

The KR files are attached.

Bert
Attachments
dxpgames.pdn
(145.46 KiB) Downloaded 596 times

Krzysztof Grzelak
Posts: 1368
Joined: Thu Jun 20, 2013 17:16
Real name: Krzysztof Grzelak

Re: NNUE

Post by Krzysztof Grzelak » Sat Jan 16, 2021 17:21

The results are similar to how Scan played without nnue.

Sidiki
Posts: 321
Joined: Thu Jan 15, 2015 16:28
Real name: Coulibaly Sidiki

Re: NNUE

Post by Sidiki » Sun Jan 17, 2021 01:20

BertTuyt wrote:
Sat Jan 16, 2021 16:41
Herewith the link to the dropbox folder with the sources of Scan nnue.
I will post later some side notes.

I would like to thank several people without whom this would not have been possible.

First of all Jonathan, for starting to implement nnue in his checkers program; I used his code as a base, and you can still find some original traces.
Fabien for his work on Scan, and sharing all sources.
Ed for providing the benchmark program KingsRow free of charge.
And finally Joost, with whom I exchanged many nnue ideas and thoughts on a daily basis.

I really hope that more people will embark on this nnue adventure and share findings, thoughts and results in this forum.

Bert

https://www.dropbox.com/sh/estknh6oqq8i ... -iw-a?dl=0
Thanks very much Bert,

I will give feedback.

Sidiki

Sidiki
Posts: 321
Joined: Thu Jan 15, 2015 16:28
Real name: Coulibaly Sidiki

Re: NNUE

Post by Sidiki » Sun Jan 17, 2021 06:51

BertTuyt wrote:
Sat Jan 16, 2021 16:50
I also checked the most recent version of Scan nnue.

The match against Kingsrow was with the traditional settings, so 1 min/65 moves, 6p DB, 6 cores, no book.

Result (perspective KR): 3W, 0L, 151D, 4U

I analyzed the 4 unknown results (4U), with outcomes: game 53 draw, game 133 a win for Scan nnue, game 141 draw and game 147 also a draw.

So the end result: 3W, 1L, 154D, Elo = 4.

The KR files are attached.

Bert
Hi Bert,

Thanks for sharing. How do I activate the nnue file of Scan31_nnue? I also want to know if the exe file is already trained. I have all the folders (.vs, x64, scan31_nnue).
What is the purpose of the file "nn_20210110.gnn" in the "data" folder?

Thanks

Sidiki

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: NNUE

Post by BertTuyt » Sun Jan 17, 2021 10:03

Sidiki,

The exe file which you need is in the folder x64/release: scan_31nnue.exe (1/16/2021).
If you already have a working Scan configuration you can just put it in the folder where the original Scan is.

The network (nn_20210110.gnn, 1/10/2021) is in the folder scan_31nnue/data.
You need to add this one to the data folder which is used by your current Scan version.

I assume there is a scan.ini file in your current Scan folder, and the data folder most likely contains all other files needed.

Hope this helps,

Bert

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: NNUE

Post by BertTuyt » Sun Jan 17, 2021 10:45

Herewith some background information for the programmers.

Files added to Scan: nnue.cpp, nnue.hpp (network routines) and simd.hpp (vector library routines).
In eval.cpp I changed most of the code, which is also reflected in eval.hpp.

The evaluation has an incremental update of the first input layer and is split into 3 main functions:
* eval_nnue_position_init(), which is called at the start of every new search.
* eval_nnue_position_increment(), updates the first layer after every move in the tree search.
* eval_nnue_position(), calculates the evaluation for the end nodes of the tree search.

For the incremental update I added an accumulator to the Node class (file pos.hpp). This accumulator contains the aligned ( alignas(32) ) intermediate (int16_t) values of the 256 neurons in the first layer.

Code: Select all

class Node {

private:

   alignas(32) int16_t m_accumulator[256]; // nnue include

public:

   int16_t* accumulator() { return m_accumulator; } // nnue include
};
Wherever I made a (small) change in the other files, I added the comment // nnue include.
You can find 16 of these comments in total: common.hpp (1), eval.hpp (4), pos.hpp (2), search.hpp (9).

* common.hpp, changed the version of Scan:

Code: Select all

const std::string Engine_Version {"3.1 nnue"}; // nnue include
* eval.hpp, added class Node and the 2 eval_nnue functions.

* pos.hpp, the 2 changes related to the accumulator; see the code above.

* search.hpp, changes for the initial ( search() ) and incremental evaluation updates ( qs() and search_move() ).

Code: Select all

Node old_node = node; // nnue include
Node new_node = node.succ(mv); // nnue include

eval_nnue_position_increment(0, old_node, new_node); // nnue include

Score sc = -qs(new_node, -beta, -std::max(alpha, bs), depth - Depth(1), ply + Ply(1), new_pv); // nnue include

Code: Select all

   Node old_node = local.node(); // nnue include
   Node new_node = local.node().succ(mv); // nnue include

   eval_nnue_position_increment(0, old_node, new_node); // nnue include

Code: Select all

   Node duplicate_node = node; // nnue include, to avoid issue with const
   eval_nnue_position_init(duplicate_node); // nnue include

   Search_Global sg;
   sg.init(si, so, duplicate_node, list, bb_size); // nnue include, also launches threads
I struggled with Node versus const Node, so duplicating the Node in the code was (more or less) a bypass; I assume others who master C++ much better than I do will find a more elegant solution.
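The incremental scheme behind these three hooks can be sketched abstractly (the feature count, neuron count and weights below are invented for the sketch): the accumulator holds the first-layer pre-activations W·x for a sparse 0/1 input x, so after a move only the weight rows of the features that changed need to be added or subtracted:

```python
import random

N_FEATURES, N_NEURONS = 512, 256          # invented sizes for the sketch
random.seed(1)
W = [[random.randint(-8, 8) for _ in range(N_NEURONS)]
     for _ in range(N_FEATURES)]          # one weight row per input feature

def full_refresh(active):
    """Like eval_nnue_position_init(): recompute the first layer from scratch."""
    acc = [0] * N_NEURONS
    for f in active:
        for j in range(N_NEURONS):
            acc[j] += W[f][j]
    return acc

def increment(acc, removed, added):
    """Like eval_nnue_position_increment(): touch only the changed features."""
    for f in removed:
        for j in range(N_NEURONS):
            acc[j] -= W[f][j]
    for f in added:
        for j in range(N_NEURONS):
            acc[j] += W[f][j]
    return acc

before = {3, 40, 100}                     # features active before the move
after = {3, 100, 200}                     # a "move": feature 40 -> 200
acc = increment(full_refresh(before), before - after, after - before)
print(acc == full_refresh(after))  # True: incremental == full recompute
```

In the engine the same arithmetic runs over the alignas(32) int16_t accumulator with SIMD, which is why a quiet move costs only a couple of row additions instead of a full matrix product.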

In a next post I will discuss the eval in some more detail.

Bert

Krzysztof Grzelak
Posts: 1368
Joined: Thu Jun 20, 2013 17:16
Real name: Krzysztof Grzelak

Re: NNUE

Post by Krzysztof Grzelak » Sun Jan 17, 2021 12:47

Sorry to ask, but what about Damage?

Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: NNUE

Post by Rein Halbersma » Sun Jan 17, 2021 14:18

Joost Buijs wrote:
Sat Jan 16, 2021 09:11
They use pandas; this should work with pyTorch too. Currently I use numpy.fromfile to read binary files, which is as fast as reading binary files with C++. I prepare the binary data in C++; Python is way too slow for this. I'm thinking about using libTorch and doing everything in C++. The syntax is different though, so this would be another learning trajectory.
For the in-memory part of that link, they use Pandas + NumPy. That works fine if you have lots of RAM. The second part of the tutorial explains tf.data.Dataset, which can iterate over a whole directory tree of files.
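What tf.data.Dataset automates can be approximated in plain Python; a sketch that lazily streams rows from every CSV shard under a directory (the shard layout here is invented):

```python
import csv
import os
import tempfile

def iter_rows(root):
    """Lazily yield rows from every .csv file under root, one file at a time."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if name.endswith(".csv"):
                with open(os.path.join(dirpath, name), newline="") as f:
                    yield from csv.reader(f)

# tiny demo: two shards of two rows each
root = tempfile.mkdtemp()
for shard in ("a.csv", "b.csv"):
    with open(os.path.join(root, shard), "w", newline="") as f:
        csv.writer(f).writerows([[shard, i] for i in range(2)])

rows = list(iter_rows(root))
print(len(rows))  # 4
```

tf.data adds the parts this sketch lacks: prefetching, parallel decode, shuffling buffers, and batching, all behind the same iterator interface.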

Sidiki
Posts: 321
Joined: Thu Jan 15, 2015 16:28
Real name: Coulibaly Sidiki

Re: NNUE

Post by Sidiki » Sun Jan 17, 2021 15:17

BertTuyt wrote:
Sun Jan 17, 2021 10:03
Sidiki,

the exe file which you need is in the folder x64/release, scan_31nnue.exe (1/16/2021).
If you already have a working Scan configuration you can just put it in the folder where the original Scan is.

The network (nn_20210110.gnn, 1/10/2021) is in the folder scan_31nnue/data.
You need to add this one to the data folder which is used by your current Scan version.

I assume in your current Scan folder there is a scan.ini file, and the data folder most likely has all other files needed.

Hope this helps,

Bert
Thanks Bert,

It helped, I saw how it works.
Thank again to you and the others.

Sidiki

Post Reply