Damage 15.3

Discussion about the development of draughts in the age of computers and the Internet.
BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Damage 15.3

Post by BertTuyt » Fri Feb 14, 2020 20:13

Over the next weeks I will try to provide regular updates regarding Damage 15.3.

There may be many programmers who would like to embark on the ML path but lack the time and/or resources.
For this reason I am sharing my .pdn file with everyone interested, also to give something back to this community.
It contains 1.368.286 games: Win/Draw/Loss/Unknown = 78663 / 1209742 / 74125 / 5756.

The .pdn format is a little different.
As I wanted to have all the information, I added the captured pieces to each capture move as a capture sequence (which, to my knowledge, does not conform to PDN 3.0).

The generation of all games took 11 days on an 8-core computer (running at 4 GHz).

The time setting was 50 ms for every game move.
I would especially like to thank Joost Buijs who provided remote access to one of his computers.

See the Dropbox link below for access (1.4 GByte!).

https://www.dropbox.com/sh/4tbkdwhc41ae ... a52ya?dl=0

Bert

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Damage 15.3

Post by BertTuyt » Sat Feb 15, 2020 09:03

I have started the first tests (and further optimizations).

Attached is the first DXP match result against Scan 3.1.
Some settings: 2 min/game, 6 cores, on a machine with an Intel i7-8700K (6 cores at 4.3 GHz).
With this set-up the maximum search speed was 80 MNodes/second.

Damage settings:

Code: Select all

variant = normal
book = 0
threads = 6
tt-shared = 1
tt-size = 24
bb-cache = 5
bb-size = 7
bb-preload = 7
bb-path = S:\
Scan settings:

Code: Select all

# main

variant = normal
book = true
book-ply = 4
book-margin = 4
threads = 6
tt-size = 24
bb-size = 6

# DXP

dxp-server = true
dxp-host = 127.0.0.1
dxp-port = 27531
dxp-initiator = false
dxp-time = 3
dxp-moves = 75
dxp-board = true
dxp-search = true
Result: 2 wins for Damage, 1 win for Scan, 155 draws.
I still need to check that there are no errors in the scores of the decisive games.
This certainly does not indicate that Damage is better, but it does suggest that the strength difference might be small.

Attached is the .pdn file as generated by the Damage GUI (unfortunately with an error in representing the captures).

Bert
Attachments
DXPMatch.pdn
(168.39 KiB) Downloaded 474 times

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Damage 15.3

Post by BertTuyt » Sun Feb 16, 2020 10:05

At first sight ML seems to be a free lunch: you just take some games, define an evaluation function with specific features, apply a logistic regression optimization, and at no cost you have a world-champion-level program.
Reality is different, as it took me several months (think 5 to 6) with some sabbatical draughts gaps in between, although Fabien proved that the idea worked, and Ed (thanks Ed, you are a gift to our community) supported me with many suggestions and shared his optimization program with me.
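
As an illustration only, the kind of logistic-regression fit meant here looks roughly like this (a minimal sketch; the data layout and names are just placeholders for the example, not Damage's actual code):

Code: Select all

// Fit evaluation weights with logistic regression by plain gradient descent.
// Each training position is reduced to its active feature indices (patterns,
// material terms) and labelled with the game result from white's perspective.
#include <cmath>
#include <vector>

struct TrainingPosition {
    std::vector<int> features;  // indices of active evaluation features
    double result;              // 1.0 = white win, 0.5 = draw, 0.0 = white loss
};

static double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// One weight per feature; the fitted weights become the evaluation function.
void fit(std::vector<double>& weights, const std::vector<TrainingPosition>& data,
         int epochs, double learningRate)
{
    for (int e = 0; e < epochs; ++e) {
        for (const TrainingPosition& p : data) {
            double score = 0.0;                        // linear evaluation of the position
            for (int f : p.features) score += weights[f];
            double error = sigmoid(score) - p.result;  // gradient of the log-loss
            for (int f : p.features) weights[f] -= learningRate * error;
        }
    }
}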

To start with the games: if the input is rubbish, the output is similar. Ed suggested, and I think he is right, that the logistic principle only works when positional mistakes in the early phases are decisive and the program is able to convert them into a win/loss (from the white perspective). This basically means that the program should play better and better as the game progresses. This is most likely the reason that, in my case, fixed-depth games did not work. With a relatively low depth the pruning is too aggressive, and when analyzing I found that simple combinations were thrown out with the bathwater (this is corrected when you go to hyper depth, like 20+ ply). In addition, when the search reaches the endgame databases, depth implodes due to the caching mechanism I used for the DB (I think some others do the same). Diversification is also a must: playing the same game 1 million times does not help.

So the game set now available (around 1.3M games) is based upon the following principles (a C++-style sketch of the generation loop follows the list):
* Fixed time for every move, in this case 50 ms, which gives sufficient depth (although I did not monitor this) on a 4 GHz machine.
* I completely switched off pruning.
* All DBs (6p in this case) were already loaded at engine start, so no SSD access was needed during the game.
* Games were stopped when the base position was in the DB (in my case 6 pieces or fewer, and no capture for either white or black).
* The first 5 moves of every game were played with a material-only evaluation, to increase diversity.
* For all moves in the game (also the first 5), the moves at the root were randomly shuffled.
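
In C++-style code the generation loop looks roughly like this; every engine call below is only a placeholder declaration (generateMoves, searchFixedTime, doMove, ...), not the real Damage interface:

Code: Select all

#include <algorithm>
#include <random>
#include <vector>

struct Board {};
struct Move {};
enum EvalMode { MATERIAL_ONLY, FULL_EVAL };

// Hypothetical engine interface, declared only for the sketch.
std::vector<Move> generateMoves(const Board&);
bool  captureAvailable(const Board&);
int   pieceCount(const Board&);
Move  searchFixedTime(const Board&, const std::vector<Move>&, int ms, EvalMode);
Board doMove(const Board&, const Move&);

void playOneGame(Board b, std::mt19937& rng, std::vector<Move>& game)
{
    for (int ply = 0; ; ++ply) {
        if (pieceCount(b) <= 6 && !captureAvailable(b))  // base position is in the 6p DB:
            return;                                      // stop, the DB provides the result
        std::vector<Move> moves = generateMoves(b);
        if (moves.empty())                               // side to move has no moves: game over
            return;
        std::shuffle(moves.begin(), moves.end(), rng);   // shuffle the root moves at every ply
        EvalMode mode = (ply < 5) ? MATERIAL_ONLY : FULL_EVAL;  // first 5 moves: material only
        Move m = searchFixedTime(b, moves, 50, mode);    // fixed 50 ms per move, pruning off
        game.push_back(m);
        b = doMove(b, m);
    }
}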

So far I have used the resulting pdn file as is; I did not analyze to what extent duplication of positions and/or complete games occurs.
And as I wanted the pdn to be self-contained, I added the captured pieces to the move string (which is not pdn compliant).

The Damage engine was modified so that it can play games in parallel within the same engine instance, which avoids loading multiple instances. The advantages are central sharing of the DB and only one output file. For this reason every search thread uses its own hash table (something you can see and set in the ini file).
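
Structurally that looks like the sketch below (placeholder types only; the real engine code is of course different): one shared, read-only endgame DB, and one private hash table per worker thread.

Code: Select all

#include <cstdint>
#include <functional>
#include <random>
#include <thread>
#include <vector>

struct TTEntry { std::uint64_t key = 0; std::int16_t score = 0; std::int8_t depth = 0; };

void worker(int id, const std::vector<std::uint8_t>& sharedDB, int gamesToPlay)
{
    std::vector<TTEntry> tt(1u << 22);   // private hash table; size per thread, as in the ini file
    std::mt19937 rng(id);                // per-thread random seed for the root-move shuffling
    for (int g = 0; g < gamesToPlay; ++g) {
        // ... play one generation game here, probing sharedDB read-only and
        //     storing search results only in this thread's local 'tt' ...
    }
}

int main()
{
    const std::vector<std::uint8_t> endgameDB(1024, 0);  // stands in for the preloaded 6p DB
    std::vector<std::thread> pool;
    for (int i = 0; i < 8; ++i)                           // e.g. one worker per core
        pool.emplace_back(worker, i, std::cref(endgameDB), 100);
    for (auto& t : pool) t.join();                        // all games end up in one output file
}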

I already shared a first result (DXP match against Scan), which looks promising.

I'm now focusing on some other (research) questions:
* 1) How does ELO scale with the number of learning games (as I directly used the pdn file, as is)?
* 2) How does ELO scale for a given number of games, but with a different win/loss ratio? In the current set the win/loss ratio is relatively low, and I want to understand what the impact is when it increases to 20%, 30%, 40%, ...

The answers will not be generic, as they are only valid within the constraints of this game set, this logistic regression algorithm (and again credits to Ed for sharing), and the features (complexity) of the evaluation (which were more or less based upon the initial ideas of Fabien). The time settings (in the current DXP games 2 minutes/game) and machine capability (I use 6 cores running at 4.3 GHz) also play a role, as with infinite time and infinite resources (like a Threadripper with 32-64 cores) ELO differences shrink to zero.

I'm currently running DXP matches for the first question.
So I want to see the effect of an evaluation based upon 10K, 20K, 40K, 80K, 160K, 320K, 640K, and 1280K learning games.

I now have the first results for 10K and 40K, and today I want to cover 160K (now running) and 640K. If available in time, I will provide intermediate results.

In a next stage I want to understand which evaluation features are dominant, and whether further evaluation optimization is possible.

Will keep you posted,

Bert

Sidiki
Posts: 321
Joined: Thu Jan 15, 2015 16:28
Real name: Coulibaly Sidiki

Re: Damage 15.3

Post by Sidiki » Sun Feb 16, 2020 15:06

Thanks Bert,
I read your post, and it's interesting. So the idea is to have a perfect or almost perfect program!?
Did you use patterns or only ML?
I will look at the pdn to get an idea of the playing style. I always thought that an aggressive style is better for winning; this was the case for Kingsrow and Scan, and also for Damage 7 at the time.
Sidiki.

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Damage 15.3

Post by BertTuyt » Sun Feb 16, 2020 17:47

Sidiki, I use a similar evaluation to Fabien's and Ed's, and calculate the weights for the generic features (man and king value) and the specific patterns (or regions, as Fabien used to call them).
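
As a rough illustration of how such a pattern (region) evaluation is typically indexed: the contents of a small group of squares are encoded as a base-3 number, which selects one learned weight per pattern. The region layout below is made up for the example, not the one Damage actually uses.

Code: Select all

#include <array>
#include <vector>

enum Content { EMPTY = 0, WHITE_MAN = 1, BLACK_MAN = 2 };  // kings left out for simplicity

// Encode the contents of an 8-square region into an index 0 .. 3^8 - 1.
int regionIndex(const std::array<int, 8>& squares)
{
    int index = 0;
    for (int s : squares) index = index * 3 + s;
    return index;
}

// The evaluation contribution of one region is a table lookup of a weight
// that was fitted (together with all other weights) by logistic regression.
int regionScore(const std::vector<int>& weights, const std::array<int, 8>& squares)
{
    return weights[regionIndex(squares)];
}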

Bert

Krzysztof Grzelak
Posts: 1368
Joined: Thu Jun 20, 2013 17:16
Real name: Krzysztof Grzelak

Re: Damage 15.3

Post by Krzysztof Grzelak » Sun Feb 16, 2020 17:59

Bert, will Damage be using the debut book?

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Damage 15.3

Post by BertTuyt » Sun Feb 16, 2020 18:04

As shared in a previous post, I'm now testing the impact of the number of learning games (on which the weights file is based) on playing strength.
So far I have finalized 158-game matches for 10K, 40K, and 160K learning games.

See the table below; it is from the perspective of Scan 3.1.
I did not further analyze the unknown games.

Code: Select all

Learning games    W     D    L   U   Total   ELO
  10000          79    76    0   3    158    195
  20000           -     -    -   -      -      -
  40000          28   128    0   2    158     63
  80000           -     -    -   -      -      -
 160000           5   151    0   2    158     11
 320000           -     -    -   -      -      -
 640000           -     -    -   -      -      -
1280000           -     -    -   -      -      -
From this table it seems that around 160K learning games one could enter a saturation zone for this evaluation.
I think this is similar to the number Fabien also mentioned.
I will now start a match based on 20K learning games to better understand the left-side slope of the curve.
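
For reference, the ELO column is consistent with the usual conversion of the match score into a rating difference (draws counted as half a point, unknown games left out); a small self-contained check:

Code: Select all

#include <cmath>
#include <cstdio>

// Rating difference implied by a match result, seen from the leading side.
double eloFromMatch(int wins, int draws, int losses)
{
    double games = wins + draws + losses;
    double score = (wins + 0.5 * draws) / games;        // score fraction
    return 400.0 * std::log10(score / (1.0 - score));   // logistic Elo model
}

int main()
{
    // Scan's perspective, as in the table above:
    std::printf("%.0f\n", eloFromMatch(79, 76, 0));   // 10K  learning games -> ~195
    std::printf("%.0f\n", eloFromMatch(28, 128, 0));  // 40K  learning games -> ~63
    std::printf("%.0f\n", eloFromMatch(5, 151, 0));   // 160K learning games -> ~11
}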

See also the attached graph. Not all data points are available yet.
elo.png (15.43 KiB)

Bert

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Damage 15.3

Post by BertTuyt » Sun Feb 16, 2020 18:07

Krzysztof, if you mean an opening book: I'm not sure if I will have time to add one. It is on my list of things to do, but maybe not before this tournament.
Damage will use its endgame DBs, which go up to 7 pieces.

Bert

Krzysztof Grzelak
Posts: 1368
Joined: Thu Jun 20, 2013 17:16
Real name: Krzysztof Grzelak

Re: Damage 15.3

Post by Krzysztof Grzelak » Sun Feb 16, 2020 18:19

Thank you for your answer Bert.

Sidiki
Posts: 321
Joined: Thu Jan 15, 2015 16:28
Real name: Coulibaly Sidiki

Re: Damage 15.3

Post by Sidiki » Sun Feb 16, 2020 19:20

BertTuyt wrote:
Sun Feb 16, 2020 18:07
Krzysztof, if you mean an opening book: I'm not sure if I will have time to add one. It is on my list of things to do, but maybe not before this tournament.
Damage will use its endgame DBs, which go up to 7 pieces.

Bert
Great,
I see that you have plenty of work at hand.
So Damage uses learning!?
The program already seems to be strong. The graph you posted from Scan's perspective: is it from the latest test, before the 158 DXP games were done, or after?

Sidiki

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Damage 15.3

Post by BertTuyt » Sun Feb 16, 2020 19:27

Sidiki, in this graph I did not include the first data point, which was based on learning from 1.368.286 games.
I mentioned that in a previous post.

This version played 2 wins, 1 loss, and 155 draws against Scan.
So I assume these statistics are not enough; I still consider Scan the better program, and especially with shorter time settings I expect Scan to beat Damage. Maybe I will do that test later.

But at normal tournament timing I guess both are in the same league.

I still have some ideas to make the evaluation function even faster (not necessarily better), and will share results in the weeks to come.

Bert
Last edited by BertTuyt on Sun Feb 16, 2020 19:37, edited 1 time in total.

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Damage 15.3

Post by BertTuyt » Sun Feb 16, 2020 19:31

Sidiki, to add to my previous post: I am doing basically nothing new, as JJ also mentioned in another thread.
What I do is build upon the work of many others, and within the draughts community that is basically Fabien and Ed.
These people should get the rewards and credits.....

So I hope that some of my evaluation ideas work, so that others can build further on them.....

Bert

Sidiki
Posts: 321
Joined: Thu Jan 15, 2015 16:28
Real name: Coulibaly Sidiki

Re: Damage 15.3

Post by Sidiki » Mon Feb 17, 2020 09:32

BertTuyt wrote:
Sun Feb 16, 2020 19:31
Sidiki, to add to my previous post: I am doing basically nothing new, as JJ also mentioned in another thread.
What I do is build upon the work of many others, and within the draughts community that is basically Fabien and Ed.
These people should get the rewards and credits.....

So I hope that some of my evaluation ideas work, so that others can build further on them.....

Bert
Bert,
Exactly, you are right: in the programming world it is all a question of sharing ideas and building on everything that already exists. It is then the genius of each programmer that produces the final product. And you, Fabien, Ed, Jan Jaap and the others are doing it perfectly.
We can only support you all in this sharing of ideas, which was missing previously in the draughts world. God bless you all.
Also, I already asked this in another post: is it possible to implement a learning option, so that the program can keep learning from the games that we play against it, or that it plays against other programs via the DXP protocol?

Sidiki.

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Damage 15.3

Post by BertTuyt » Mon Feb 17, 2020 10:39

Sidiki, thanks for your post.

Continuous learning, as you mention, is not implemented.
I also doubt whether it would really be useful.

In the case of Damage the weights file is based upon a little more than 1.3M learning games as input.
The graphs which I started to share indicate that, with the given evaluation function, one already reaches saturation, so more input games do not improve the overall program performance.

I'm also not sure whether a different (more complex) evaluation would yield a better result, but this is something I want to test in the next weeks.

Bert

Sidiki
Posts: 321
Joined: Thu Jan 15, 2015 16:28
Real name: Coulibaly Sidiki

Re: Damage 15.3

Post by Sidiki » Mon Feb 17, 2020 14:29

OK Bert, understood.
We are already happy that the "throwing the baby out with the bathwater" issue has been solved. Also, with the level now reached, Damage is fine.
Just a question, since I see that Damage has gained in strength: what, in your view, is the weakness that still needs to be fixed?
Because I suppose that you aren't yet "happy" with it 🙂😎
