Damage 15.3

BertTuyt · Post by **BertTuyt** » Mon Feb 24, 2020 14:38

I have finished the previous test.
See the table and graph below.

Games	W	D	L	U	T	ELO
10000	31	127	0	0	158	69
20000	11	147	0	0	158	24
40000	7	151	0	0	158	15
80000	5	151	0	2	158	11
160000	6	148	1	3	158	11
320000	4	152	0	2	158	9
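
The ELO column appears to follow from the W/D/L counts through the standard logistic Elo formula. Below is a minimal C++ sketch of that calculation. The function name `elo_from_result` is hypothetical, and leaving the U games out of the total is an assumption (one that seems to reproduce the values in the table).

```cpp
#include <cassert>
#include <cmath>

// Elo difference implied by a match result, using the standard logistic model.
// Assumption: games counted in column U are left out of the total.
double elo_from_result(int wins, int draws, int losses)
{
    const int games = wins + draws + losses;
    assert(games > 0);
    const double score = (wins + 0.5 * draws) / games; // fraction of points scored
    assert(score > 0.0 && score < 1.0);                // the formula diverges at 0% and 100%
    return 400.0 * std::log10(score / (1.0 - score));
}
```

For the 10000-game row (31 wins, 127 draws, 0 losses), `elo_from_result(31, 127, 0)` gives roughly 69.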

[Attachment: elo4.png]

The conclusions are the same:
* With a higher share of decisive games (wins and losses relative to draws) in the learning set, learning is faster (fewer games are needed).
* Saturation occurs at a similar level (but earlier).
* Damage 15.3 is around 10 ELO points weaker than Scan 3.1.

I will now do some tests with an even higher win/loss ratio at 10K games, to find out if there is an optimum.

Bert

jj · Post by jj » Mon Feb 24, 2020 22:44

Interesting to see how filtering most draws changes the curve. We need many more games for significant results but the trend will likely not be different.

Do I understand correctly Bert that your training games were played with the handcrafted Damage evaluation function? I used two batches, the first with material-only evaluation and the second with the evaluation trained from the first batch, so bootstrapping from zero knowledge. I banned forward pruning from the beginning and the first results were already promising, even though I used very fast games (average 0.02 sec/move). A logical next step is reinforcement learning but my first results with that were not very promising so I didn't pursue it further (yet).

Out of curiosity I also tried training the evaluation on 500+K human games but this gives a result worse than training on 512K material-only-evaluation games (about -50 ELO @ 1 sec/move).

To examine the correlation between the number of training games and playing strength I did a similar experiment with 1K...1M training games and three baseline evaluations: material-only, Maximus handcrafted, and the 1M version.

[Attachment: SL.png]

This was still with the 4x4 patterns; I haven't repeated the experiment for the 6x4 patterns. It was surprising to see that already with 1K training games the score against (superfast) material-only evaluation was > 80%, and already with 32K training games the evaluation surpassed the handcrafted evaluation of Maximus.

I have been working on other things since then, but I would definitely like to return to this subject.

Jan-Jaap

P.S. Concerning the black hole (all draws), there is always 12x12 and 14x14 draughts.

Fabien Letouzey · Post by **Fabien Letouzey** » Tue Feb 25, 2020 08:40

jj wrote (Mon Feb 24, 2020 22:44):
Do I understand correctly Bert that your training games were played with the handcrafted Damage evaluation function? I used two batches, the first with material-only evaluation and the second with the evaluation trained from the first batch, so bootstrapping from zero knowledge.

What I usually do is first build a PST. No kings, balance, or game phase; just material + man position. It's very fast to learn, and doesn't require many examples.

I observe that PST evals are about half way between material and patterns (which I view as higher-order PSTs), making them a good tool for experiments. Lidraughts uses PSTs for weak Scan levels (1-5 for most variants).
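
To make the idea concrete, here is a minimal C++ sketch of a PST (piece-square table) evaluation for men only, with material folded into the per-square weights. Everything in it is an assumption for illustration: the 0..49 square indexing, the bitboard layout, the `Pst` type and the function name `pst_eval` are hypothetical and are not taken from Scan.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical PST for 10x10 draughts: one learned weight per square.
// Squares are indexed 0..49; bit i of a bitboard marks a man on square i.
constexpr int kSquares = 50;
using Pst = std::array<int, kSquares>;

// Sum the table entry of every white man and subtract the entry of every
// black man, looked up on the 180-degree rotated square so that both
// sides share one table. A positive result favours white.
int pst_eval(std::uint64_t white_men, std::uint64_t black_men, const Pst& pst)
{
    int score = 0;
    for (int sq = 0; sq < kSquares; ++sq) {
        const std::uint64_t bit = std::uint64_t{1} << sq;
        if (white_men & bit) score += pst[sq];
        if (black_men & bit) score -= pst[kSquares - 1 - sq];
    }
    return score;
}
```

With a flat table (every entry equal to the man value) this reduces to a material-only evaluation, which is one way to see PSTs as the step between material and patterns.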

Fabien.

BertTuyt · Post by **BertTuyt** » Tue Feb 25, 2020 19:06

Jan-Jaap,

Regarding your question whether my training games were played with the handcrafted Damage evaluation function:

Nope, I used a simplified pattern-based evaluation function.
Since Ed gave me access to his optimization routines, I also learned the format of his weights.bin file.
Therefore I could use this (with only a very basic evaluation) to generate games (I leave it up to Ed whether he would like to provide further details).
Like you, I switched off forward pruning.

In this way I was able to get decent results quickly, although it is still my intention to follow a parallel path via bootstrap learning (starting from material only), as you did.

I will run another DXP match simulation today, to measure the effect of a decreasing draw rate.

This weekend I will start to derive and test different evaluation functions from my data set.

Bert

BertTuyt · Post by **BertTuyt** » Wed Feb 26, 2020 18:27

Here are some results with a decreasing draw rate; the learning set is 10K games in each case.

Games	W	D	L	U	T	ELO
10000	79	76	0	3	158	195
10000	31	127	0	0	158	69
10000	22	135	0	1	158	49
10000	25	132	0	1	158	56

The non-draw percentage in the learning set:

Non-draw %	ELO
11.32%	195
28.25%	69
44.48%	49
54.72%	56

And the graph:

[Attachment: elo5.png]

This is a small indication that we are in a neutral zone where further improvement is not possible, or already in the diminishing-returns phase.
I will do an additional test.

Bert

jj · Post by jj » Wed Feb 26, 2020 21:16

Fabien Letouzey wrote (Tue Feb 25, 2020 08:40):
What I usually do is first build a PST. No kings, balance, or game phase; just material + man position. It's very fast to learn, and doesn't require many examples.

I observe that PST evals are about half way between material and patterns (which I view as higher-order PSTs), making them a good tool for experiments. Lidraughts uses PSTs for weak Scan levels (1-5 for most variants).

Interesting. That is also a useful application: beginner levels. I want to revisit this subject for my app update.

Jan-Jaap

BertTuyt · Post by **BertTuyt** » Fri Mar 06, 2020 13:23

Here is the last update of my test.
Here too the results saturate, although I expect that with an even higher win/loss rate the curve will go upwards again.
But I will not test this further.

Games	W	D	L	U	T	ELO
10000	79	76	0	3	158	195
10000	31	127	0	0	158	69
10000	22	135	0	1	158	49
10000	25	132	0	1	158	56
10000	22	133	0	3	158	50

[Attachment: elo6.png]

Bert

Sidiki · Post by **Sidiki** » Sat Mar 07, 2020 10:48

BertTuyt wrote (Fri Mar 06, 2020 13:23):
Herewith the last update of my test.
Also here saturation mode, although I expect that with an even higher win/lose rate, the curve will go upwards again.
But will not further test this.
Games	W	D	L	U	T	E
10000	79	76	0	3	158	195
10000	31	127	0	0	158	69
10000	22	135	0	1	158	49
10000	25	132	0	1	158	56
10000	22	133	0	3	158	50
elo6.png

Bert

Hi Bert,
It seems that things are going better.

Sidiki

BertTuyt · Post by **BertTuyt** » Sat Mar 21, 2020 15:45

I'm now finalizing a test with the latest engine.
I expect this engine to be on the order of 20 ELO points weaker than Scan and Kingsrow (in a 158-game DXP match at 2 min/game).
So around 10 expected losses and 148 draws (based on my 6-core 4.3 GHz system).
But maybe with zillions of cores (the program should handle the 16 cores of the Threadripper), results will improve slightly.
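
The step from an Elo deficit to an expected number of losses can be checked with a short calculation. This is a sketch under the assumption of the standard logistic Elo model; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Expected score (0..1) for a player who is elo_deficit points weaker,
// using the standard logistic Elo model.
double expected_score(double elo_deficit)
{
    return 1.0 / (1.0 + std::pow(10.0, elo_deficit / 400.0));
}

// Net losses (losses minus wins) over `games` games that produce this score,
// from score = 0.5 - net_losses / (2 * games).
double expected_net_losses(double elo_deficit, int games)
{
    assert(games > 0);
    return games * (1.0 - 2.0 * expected_score(elo_deficit));
}
```

`expected_net_losses(20.0, 158)` comes out at roughly 9, which is in line with the estimate of around 10 losses and 148 draws.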

I will post a version (including GUI) tomorrow (= Sunday), with some instructions.
Feedback is welcome, as are test results (DXP match, 2 min/game) against Kingsrow, Scan, Dragon, Moby Dam, Sjende Blyn, Maximus, ...

As less is more (although not always better) I'm now using the absolute minimal evaluation function, see code below.

int CEvaluation::Eval_Position(position_t* position)
{
	// Bitboards of the white and black men, used by the pattern lookups.
	bitboard_t wm = position->wman(), bm = position->bman();

	// Material: man and king counts, black minus white.
	score_t score = sweights.property[MATL_MEN] * (position->countbman() - position->countwman());
	score += sweights.property[MATL_KING] * (position->countbking() - position->countwking());

	// Four pattern lookups; each call covers a pair of patterns.
	score += pat12(0, bm, wm, MASK12_PATTERN0, MASK12_PATTERN7); // Patterns 0, 7
	score += pat12(1, bm, wm, MASK12_PATTERN1, MASK12_PATTERN6); // Patterns 1, 6
	score += pat12(2, bm, wm, MASK12_PATTERN2, MASK12_PATTERN5); // Patterns 2, 5
	score += pat12(3, bm, wm, MASK12_PATTERN3, MASK12_PATTERN4); // Patterns 3, 4

	// Interpolate between the end-game and mid-game scores by piece count, then scale down.
	int eval = (40 * score.m_eg + position->countallpiece() * (score.m_mg - score.m_eg)) / 2000;

	// Flip the sign when it is black's turn.
	return (position->bturn() == true ? -eval : eval);
}

So there is the standard material for man and king (although not split into 2 weight sets as in Scan).
In addition there are the 4 patterns.
For the game phase (to interpolate between mid- and end-game) I use the piece count, which is a little faster.
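
The interpolation line can be read as a linear blend: with n pieces on the board, (40 * eg + n * (mg - eg)) / 2000 equals ((40 - n) * eg + n * mg) / 40, divided by 50. The sketch below isolates that arithmetic. The `Score` struct is a hypothetical stand-in for `score_t`, and the 40-piece maximum assumes the 10x10 starting position.

```cpp
#include <cassert>

// Hypothetical stand-in for the score_t used in the listing: separate
// mid-game (m_mg) and end-game (m_eg) totals.
struct Score {
    int m_mg;
    int m_eg;
};

// Same arithmetic as the interpolation line of Eval_Position, on its own.
// pieces is the total piece count (40 at the start of a 10x10 game).
int blend(const Score& s, int pieces)
{
    assert(pieces >= 0 && pieces <= 40);
    return (40 * s.m_eg + pieces * (s.m_mg - s.m_eg)) / 2000;
}
```

With 40 pieces the result is m_mg / 50, with 0 pieces it is m_eg / 50, and in between it moves linearly from one to the other.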

So it is really minimal in comparison with (for example) Sjende Blyn, and I am curious how Damage scores against SB.

I will keep you posted.

Bert

Krzysztof Grzelak · Post by **Krzysztof Grzelak** » Sat Mar 21, 2020 20:35

Thank you for the information, Bert. How can I download the program on Monday afternoon?

BertTuyt · Post by **BertTuyt** » Sat Mar 21, 2020 22:21

I will give a download link.

Bert

Krzysztof Grzelak · Post by **Krzysztof Grzelak** » Sat Mar 21, 2020 23:49

Thank you Bert.

BertTuyt · Post by **BertTuyt** » Sun Mar 22, 2020 00:16

Here is some background on the download.
Included are the Damage GUI, the engine, and the 7p DB (around 16.9 GB).
Everything should work out of the box; just start Damage2020.
It could be that additional Microsoft installs are needed, but we will find out soon.

The Damage GUI is basically a bug factory; it has many options which were never finalized or tested.
I'm also converting to a Hub-based GUI, so not everything might work.
For now I would limit its use to the DXP option, which I will explain later.

The engine is also a work in progress; there is a strange crash which does not occur often, but the bug is still hidden.
Nevertheless I wanted to share it with you, so all is ready for the next tournament.

Some details:

GUI startup:

boardsize = 10
engine = damageengine153.exe
protocol = guide

Don't change boardsize = 10; I used this option while working on 8x8 draughts.
The engine should be installed in the Engines directory.
The protocol for the Damage engine is guide, but I'm working on supporting the Hub protocol as well.
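
For readers who want to script around these settings files, here is a minimal C++ sketch that reads "key = value" lines into a map. The format is inferred only from the listing above; how the Damage GUI actually parses its files is not stated here, and `trim` and `parse_settings` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Strip leading and trailing spaces, tabs and carriage returns.
std::string trim(const std::string& s)
{
    const auto first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// Parse "key = value" lines; lines without '=' are skipped.
std::map<std::string, std::string> parse_settings(const std::string& text)
{
    std::map<std::string, std::string> settings;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        const auto eq = line.find('=');
        if (eq == std::string::npos) continue;
        settings[trim(line.substr(0, eq))] = trim(line.substr(eq + 1));
    }
    return settings;
}
```

Parsing the three lines above would give a map in which `protocol` maps to `guide` and `boardsize` to the string `10`.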

When the GUI has started, you should also see the Damage engine, and if all is OK, you should see the following text in the Info window.

[Attachment: infowindow.PNG]

The damage.ini file contains the configuration settings:

variant = normal
book = 0
threads = 6
tt-shared = 1
tt-size = 25
bb-cache = 5
bb-size = 7
bb-preload = 7

I was also working on a breakthrough variant, but this is not supported in the 15.3 version.
threads = 6: you can increase the number of threads, based on the actual number of cores. For those who have a 16-core machine, I suggest starting with 8 threads and then increasing to 16.
tt-size: the hash table has 2^tt-size entries (here 2^25); every entry is 16 bytes.
bb-cache: the endgame DB cache in units of 4 GByte, so with 5 (20 GByte) the whole DB can be loaded in memory; this is the preferred setting if memory is no problem.
bb-size: the maximum number of pieces in the DB.
bb-preload: which part is preloaded in memory (in this case all).
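
As a quick check on memory use, the two size settings can be converted to bytes. This is a sketch with hypothetical function names; it assumes the 4 GByte unit is binary (4 × 2^30 bytes), which the post does not specify.

```cpp
#include <cassert>
#include <cstdint>

// Bytes used by a hash table of 2^tt_size entries, 16 bytes per entry.
std::uint64_t tt_bytes(int tt_size)
{
    assert(tt_size >= 0 && tt_size < 59); // keep the shift and the multiply in range
    return (std::uint64_t{1} << tt_size) * 16;
}

// Endgame database cache in bytes; bb_cache counts units of 4 GiB (assumed binary).
std::uint64_t bb_cache_bytes(int bb_cache)
{
    assert(bb_cache >= 0);
    return static_cast<std::uint64_t>(bb_cache) * 4 * (std::uint64_t{1} << 30);
}
```

With the default tt-size = 25 the hash table takes 2^29 bytes, i.e. 512 MiB, and bb-cache = 5 corresponds to 20 GiB.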

As I did not test all, I would limit the GUI use to the DXP functionality.

[Attachment: dxp function.PNG]

Before starting the DXP match, start the opponent program in wait-for-DXP-connection mode.
So far I have tested this with Scan and Kingsrow.

Under Options you can set the parameters for a DXP match; the settings below are for a 2-move ballot match at 2 min/game.

[Attachment: options.PNG]

To display the DXP Match status, toggle info.
Then push the make button (on the left).
The connection should now be established, and visualized in the info window.

[Attachment: info2.PNG]

Then you can start the DXP Match with the Run button.

I hope it works, and I look forward to any feedback, suggestions, bug reports, ...

Herewith the link: https://www.dropbox.com/sh/pxqa6zc8bu5o ... WOqua?dl=0

Bert

Yves · Post by **Yves** » Sun Mar 22, 2020 09:36

Hello Bert,
Thanks for looking at the problem.
Attached is an image with the file in French.
Sincerely, Yves

BertTuyt · Post by **BertTuyt** » Sun Mar 22, 2020 09:52

Yves, thanks for your email.
Just so I understand: do you see this message during initial startup?
How much memory does your system have?

Bert

World Draughts Forum
