ENGINE SPEED AND STRONG

Fabien Letouzey · Post by **Fabien Letouzey** » Tue Feb 21, 2017 06:51

Sidiki wrote: I mean, and this is also a question because I am not very sure: to be able to provoke mistakes from another engine, ours must search much deeper, so that it plays moves whose consequences the other engines do not see. The move they just played looks good for a while but is really bad later on.
In other words, it must have a serious advantage in search depth over the others.

You don't need to: you can evaluate positions better than the opponent.

Fabien Letouzey · Post by **Fabien Letouzey** » Tue Feb 21, 2017 06:54

Joost Buijs wrote:I never dealt with a game with a very high draw rate like draughts before, it is possible that engines like Kingsrow and Scan are already near or at the highest level reachable. On the other hand I think that you can encode only a small subset of draughts knowledge in the 4x4 patterns that are currently in use by the top engines and that there is still enough room for improvement, time will tell I guess.

If the game is really as dull as it appears to be, then I suppose it is time to switch to something else like killer draughts.

Nobody thinks that perfection has been reached; it's only unclear how much progress is still possible. I think Bert and perhaps Michel have a special interest in finding that out. My guess is about 200 Elo, only 100 of which is obtainable with current technology. The rest will have to wait for machines to program themselves, not that I'm longing for it.

Regardless, killer draughts is much richer strategically and my 3x formula gives hope of 600 Elo yet to be discovered.

Fabien Letouzey · Post by **Fabien Letouzey** » Tue Feb 21, 2017 06:55

Joost Buijs wrote:Since the rules of killer draughts are different you need another egdb for it, I believe that Michel is the only one who calculated an egdb for it.

I have tables for 6 pieces. Walter showed interest about a year ago, and I completed the set.

Joost Buijs · Post by **Joost Buijs** » Tue Feb 21, 2017 17:30

Fabien Letouzey wrote:
Joost Buijs wrote:I never dealt with a game with a very high draw rate like draughts before, it is possible that engines like Kingsrow and Scan are already near or at the highest level reachable. On the other hand I think that you can encode only a small subset of draughts knowledge in the 4x4 patterns that are currently in use by the top engines and that there is still enough room for improvement, time will tell I guess.

If the game is really as dull as it appears to be, then I suppose it is time to switch to something else like killer draughts.
Nobody thinks that perfection has been reached; it's only unclear how much of a progress is still possible. I think Bert and perhaps Michel have special interest in finding that out. My guess is about 200 Elo, only 100 of which obtainable with current technology. The rest will have to wait for machines to program themselves, not that I'm longing for it.

Regardless, killer draughts is much richer strategically and my 3x formula gives hope of 600 Elo yet to be discovered.

There are a few things that might be improved: larger patterns, or patterns that are more in line with draughts theory. It must also be possible to get some additional gain out of the search.

The engine I'm working on is currently quite comparable to Kingsrow in strength (also at hyper bullet). In the last 5 days it improved quite a bit: I fixed many small bugs, and it was also a matter of trial-and-error testing to find out which things work and which don't (the fractional-minute timing in the Kingsrow GUI helped quite a bit with this process). The search is still very basic: plain PVS with a transposition table, killers and history for move ordering, very simple pruning and LMR. This is typically what you see in chess engines of ~2900 Elo. Stockfish is almost 500 Elo stronger; this is achieved by search enhancements which enable it to prune very aggressively and to look very far ahead without making large errors. To beat Stockfish with a more standard engine you typically need a 100-to-1 speed advantage, which is quite amazing. I guess something like this must be possible for draughts as well. Not that I expect it to give an additional 500 Elo, but like you I expect that 100 to 150 Elo might be possible.

Joost

Catherine · Post by **Catherine** » Tue Feb 21, 2017 20:10

Hi Joost,

So such Elo ratings exist? Stockfish's Elo, I mean.
I don't know the actual Elo of Kingsrow, Scan, Argus, Damage, Damy, Dragon...
I just want to ask whether, in draughts, it is simply having the largest patterns that increases the Elo of an engine, and with it its strength?
Or is a particular algorithm combined with this pattern concept to allow the engine to use it?

I ask this because in the past most programmers used brute-force minimax and then alpha-beta, but now we have LMR and other new algorithms.

It seems that in the past Ed used a certain algorithm, different from that of most common engines, from 2006 to 2012.
And with this pattern concept, Kingsrow seems to search deeper than the previous versions.

This question is for Ed or anyone who has information on it: "Is it the same algorithm used in the past with the pattern concept added, or is it a new one?"

Thanks,

Catherine

BertTuyt · Post by **BertTuyt** » Tue Feb 21, 2017 21:42

I'm indeed very interested in understanding how much Computer Draughts strength can still grow.
As Joost pointed out, since the game is quite drawish (compared with chess), the actual gain might be less than 200 Elo (just an arbitrary metric). Michel and I spent quite some time on this in the past, and several posts are related to it.
I will restart some older studies, now with Scan (and hopefully with a modified Argus), to find some new answers.
So I'm optimistic that one day we will find a new engine (at least) 30 Elo stronger than Scan, and it might be related to further improvements in the evaluation function...
The quest goes on.

Bert

Joost Buijs · Post by **Joost Buijs** » Wed Feb 22, 2017 07:54

Hi Catherine,

Stockfish is a chess engine, and chess Elo is calibrated differently. For instance, in chess an IGM/GMI has about 2500 Elo, and in draughts about 2300. Here in the Netherlands the draughts rating scale is calibrated 850 Elo below that, so a GMI has ~1450 Elo on that scale (I don't understand why they did this).

The algorithms you are talking about are actually very old. The latest addition is LMR, which became popular 12 years ago but is actually much older than that.

Most chess and draughts engines use principal variation search (PVS). Ed told me Kingsrow uses MTD(f), which was invented in 1994; basically it does the same thing in a somewhat different way. Both algorithms use minimax (or negamax) with alpha-beta pruning, so the differences are not very large.
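
As a rough illustration of "the same thing in a somewhat different way", here is a minimal Python sketch of MTD(f) driving a fail-soft alpha-beta (negamax) search over a toy game tree. The tree representation (nested lists with integer leaves) and the function names are assumptions made for this example, and there is no transposition table, which a real MTD(f) engine relies on for efficiency. It is not Kingsrow's code.

```python
import math

def negamax(node, alpha, beta):
    """Fail-soft alpha-beta in negamax form.

    A node is either an int (leaf score from the side to move's view)
    or a list of child nodes (hypothetical toy representation).
    """
    if isinstance(node, int):
        return node
    best = -math.inf
    for child in node:
        score = -negamax(child, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # beta cutoff
    return best

def mtdf(root, first_guess=0):
    """MTD(f): converge on the minimax value with zero-window searches."""
    g = first_guess
    lower, upper = -math.inf, math.inf
    while lower < upper:
        beta = g + 1 if g == lower else g
        g = negamax(root, beta - 1, beta)  # zero-width window
        if g < beta:
            upper = g  # search failed low: g is an upper bound
        else:
            lower = g  # search failed high: g is a lower bound
    return g

toy_tree = [[3, 5], [2, 9], [0, 7]]
print(mtdf(toy_tree))                            # 3
print(negamax(toy_tree, -math.inf, math.inf))    # 3, same value with a full window
```

PVS gets to the same minimax value by searching the first move with a full window and the remaining moves with zero windows, re-searching only when one of them fails high.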

Pattern evaluation is not new either; I already used that concept in the eighties in my Othello engine. The big difference is that in the past all weights (evaluation parameters) were tuned by hand with trial and error. Over the last ten years, tuning the parameters with some form of machine learning (ML) has become popular, with Fabien being the first to use it for international draughts.

The size of the patterns can be important: the larger the pattern, the more information you can encode in it. The problem is that the amount of work needed to tune the weights grows very rapidly and becomes undoable above a certain size. Dragon seems to use 6x6 patterns which contain 3^15 different configurations; I wonder how Michel did this.
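
To make the size argument concrete, here is a small sketch of how a pattern of N squares with three states each (empty, white man, black man) maps to an index into a weight table of 3^N entries. The state encoding and the function names are assumptions for illustration, not taken from any of the engines mentioned.

```python
EMPTY, WHITE_MAN, BLACK_MAN = 0, 1, 2  # assumed encoding, 3 states per square

def pattern_index(squares):
    """Return the base-3 index of a pattern given its square states."""
    index = 0
    for state in squares:
        index = index * 3 + state
    return index

def table_size(num_squares, states_per_square=3):
    """Number of weights needed for one pattern."""
    return states_per_square ** num_squares

# A 4x4 board region covers 8 playable squares.
print(table_size(8))    # 6561
print(table_size(15))   # 14348907
```

Every extra square multiplies the table by three, so the number of weights to tune (and the training data needed to see each configuration often enough) grows exponentially with pattern size.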

Joost

Joost Buijs · Post by **Joost Buijs** » Wed Feb 22, 2017 16:41

BertTuyt wrote:Im indeed very interested to understand how much we could grow in Computer Draughts strength.
As Joost pointed out , as the game is quite Drawish (compared with chess) the actual gain might be less than 200 ELO (just an arbitrary metric). Michel and myself spent quite some time in the past, and several posts are related to this.
I will restart some older studies, now with Scan (and hopefully with a modified Argus), to find some new answers.
So Im optimistic that one day we find a new engine (at least) 30 ELO stronger as Scan, and it might be related to further improvements in a better evaluation function......
The quest goes on

Bert

Bert,

There is still something to gain. I can clearly see that the draw rate is influenced by how the search is tuned: when I tune it passively, the draw rate is almost 100%; when I tune it aggressively, the draw rate gets lower but the win/loss ratio remains about the same. In my view this means that there is still a lack of depth. If you can enhance your search in such a way that it keeps the aggressiveness and the wins that come with it while avoiding the losses, then you will gain strength, and this has nothing to do with evaluation at all.

Evaluation can be improved as well, but it all depends upon whether you want to spend time on it. To be honest there is not much activity going on in this community and this is not very inspiring to say the least.

Joost

Fabien Letouzey · Post by **Fabien Letouzey** » Wed Feb 22, 2017 17:45

BertTuyt wrote:Im indeed very interested to understand how much we could grow in Computer Draughts strength.
As Joost pointed out , as the game is quite Drawish (compared with chess) the actual gain might be less than 200 ELO (just an arbitrary metric). Michel and myself spent quite some time in the past, and several posts are related to this.
I will restart some older studies, now with Scan (and hopefully with a modified Argus), to find some new answers.
So Im optimistic that one day we find a new engine (at least) 30 ELO stronger as Scan, and it might be related to further improvements in a better evaluation function......
The quest goes on

Michel observed that while the number of draws greatly depends on time control, the win/loss ratio remains mostly invariant. If that's true (enough), I nominate "Wilo" as a progress measure that we can exchange. It's a pun on Elo and win/loss that was mentioned in chess; Wilo is simply the Elo formula applied to non-draw games. So you throw away the draws, like for a statistical test, and you compute Elo on the rest. If Michel is right, Wilo will depend little on time. Of course, the measurements will be imprecise since there are so few non-draws; there is no escaping the black hole.

Furthermore, it should be possible to separately develop a draw model with Wilo + time information, i.e. how draws appear with more time (basically nodes per move). I think it's the kind of thing that you like to compute, as there are very few variables and you can use existing tools.

You also mentioned future experiments with a simplistic eval function, maybe just PST. I'll be busy with Othello until April, but I'll try to squeeze that as a side project (just computing the eval weights).

Fabien Letouzey · Post by **Fabien Letouzey** » Wed Feb 22, 2017 17:54

Joost Buijs wrote:Pattern evaluation is not new either, I already used that concept in the eighties in my Othello engine, the big difference is that in the past all weights (evaluation parameters) were tuned by hand with trial and error, since the last ten years tuning the parameters with some form of machine-learning (ML) becomes popular with Fabien being the first to have used it with international draughts.

The size of the patterns can be important, how larger the pattern is the more information you can encode in it, the problem is that the amount of work that you have to do to tune the weights grows very rapidly and becomes undoable above a certain size. Dragon seems to use 6x6 patterns which contain 3^15 different configurations, I wonder how Michel did this.

Michel used patterns before me. But I think that he viewed them as second-class citizens and used them in addition to his previous evaluation features. My guess is that he ended up with a complexity monster: very good in pure accuracy, but IMO overkill in an engine.

Regarding the number of variables, he probably used hashing (variable hash(v) instead of variable v everywhere). Actually I think it's 5^18 because he also handles kings. Of course, with hashing the total number of variables is mostly irrelevant.

MichelG · Post by **MichelG** » Wed Feb 22, 2017 18:42

Fabien Letouzey wrote: Michel observed that while the number of draws greatly depends on time control, the win/loss ratio remains mostly invariant. If that's true (enough), I nominate "Wilo" as a progress measure that we can exchange. It's a pun on Elo and win/loss that was mentioned in chess; Wilo is simply the Elo formula applied to non-draw games. So you throw away the draws, like for a statistical test, and you compute Elo on the rest. If Michel is right, Wilo will depend little on time. Of course, the measurements will be imprecise since there are so few non-draws; there is no escaping the black hole.

I like the wilo name

I usually look at the win/lose ratio only anyway.

Here is an example match that I ran today between 2 versions of my engine, at 4 seconds per game:
games: 10700, player1 wins :633 player2 wins:731 draw:9336 score:0.991 (+/- 0.0034, -2.7 sigma) player 1 depth:12.348 player 2 depth 12.655, nondraws: 12.7%

Clearly player 2 is better, getting 15% more wins and searching somewhat deeper on average. But if you look at Elo, it scores 50.4% and that is only a 3 point difference. In Wilo it's 25, and I consider it worth the programming effort it took.
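
For anyone who wants to reproduce these two numbers, here is a minimal sketch using the standard logistic Elo formula. The function names are made up for this example; Wilo is computed as described earlier in the thread, i.e. the same formula applied after throwing away the draws.

```python
import math

def elo_from_score(score):
    """Elo difference implied by an expected score in (0, 1)."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo(wins, losses, draws):
    """Conventional Elo difference; draws count as half a point."""
    games = wins + losses + draws
    return elo_from_score((wins + 0.5 * draws) / games)

def wilo(wins, losses):
    """Elo formula applied to decisive games only (draws discarded)."""
    return elo_from_score(wins / (wins + losses))

# Match result above, seen from player 2's side.
print(round(elo(731, 633, 9336), 1))   # 3.2
print(round(wilo(731, 633), 1))        # 25.0
```

Note that Wilo reduces to 400 * log10(wins / losses), so it depends only on the win/loss ratio.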

Michel

Joost Buijs · Post by **Joost Buijs** » Wed Feb 22, 2017 19:40

Fabien Letouzey wrote:
Joost Buijs wrote:Pattern evaluation is not new either, I already used that concept in the eighties in my Othello engine, the big difference is that in the past all weights (evaluation parameters) were tuned by hand with trial and error, since the last ten years tuning the parameters with some form of machine-learning (ML) becomes popular with Fabien being the first to have used it with international draughts.

The size of the patterns can be important, how larger the pattern is the more information you can encode in it, the problem is that the amount of work that you have to do to tune the weights grows very rapidly and becomes undoable above a certain size. Dragon seems to use 6x6 patterns which contain 3^15 different configurations, I wonder how Michel did this.
Michel used patterns before me. But I think that he viewed them as second-class citizens and used them in addition to his previous evaluation features. My guess is that he ended up with a complexity monster: very good in pure accuracy, but IMO overkill in an engine.

Regarding the number of variables, he probably used hashing (variable hash(v) instead of variable v everywhere). Actually I think it's 5^18 because he also handles kings. Of course, with hashing the total number of variables is mostly irrelevant.

Taking kings into consideration is probably asking too much. After looking at many games I have the impression that most games are decided well before there are kings on the board; usually the mistake is made very early in the midgame and after that there is no escape. I'm not a draughts player by any means, but I can clearly see that the engine sometimes gets into a decaying line that turns out to be a disaster. How to recognize these lines, however, is not clear to me.

I guess that by hashing you mean that patterns or positions that look similar are given the same score; this is something I've never considered before.

Joost

Fabien Letouzey · Post by **Fabien Letouzey** » Wed Feb 22, 2017 19:49

Joost Buijs wrote:I guess that with hashing you mean that patterns or positions that looks similar are rewarded the same score, this is something I've never considered before.

I'm talking about the learning process which has no notion of positions, only variables (= weights). So any time you are about to read/write variable v, do it with hash(v) instead. You can view it as random weight sharing.
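
A minimal sketch of the idea, assuming a fixed-size slot array and a simple multiplicative hash (both are illustrative choices, not a description of what Michel actually did): every read or write of logical variable v goes to slot hash(v), so variables that collide silently share one weight.

```python
class HashedWeights:
    """Weight vector where logical variable v is stored at slot hash(v)."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.slots = [0.0] * num_slots

    def _slot(self, v):
        # Multiplicative hash on the variable index (assumed choice).
        return (v * 2654435761) % (2 ** 32) % self.num_slots

    def read(self, v):
        return self.slots[self._slot(v)]

    def update(self, v, delta):
        self.slots[self._slot(v)] += delta


weights = HashedWeights(num_slots=1 << 20)  # about a million real weights
weights.update(5 ** 18 - 1, 0.25)           # a variable index far beyond the table size
print(weights.read(5 ** 18 - 1))            # 0.25
```

The learner only ever sees `read` and `update`, so the nominal number of variables (3^15, 5^18, ...) no longer determines memory use; only `num_slots` does.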

Rein Halbersma · Post by **Rein Halbersma** » Wed Feb 22, 2017 19:50

Fabien Letouzey wrote:
BertTuyt wrote:Im indeed very interested to understand how much we could grow in Computer Draughts strength.
As Joost pointed out , as the game is quite Drawish (compared with chess) the actual gain might be less than 200 ELO (just an arbitrary metric). Michel and myself spent quite some time in the past, and several posts are related to this.
I will restart some older studies, now with Scan (and hopefully with a modified Argus), to find some new answers.
So Im optimistic that one day we find a new engine (at least) 30 ELO stronger as Scan, and it might be related to further improvements in a better evaluation function......
The quest goes on
Michel observed that while the number of draws greatly depends on time control, the win/loss ratio remains mostly invariant. If that's true (enough), I nominate "Wilo" as a progress measure that we can exchange. It's a pun on Elo and win/loss that was mentioned in chess; Wilo is simply the Elo formula applied to non-draw games. So you throw away the draws, like for a statistical test, and you compute Elo on the rest. If Michel is right, Wilo will depend little on time. Of course, the measurements will be imprecise since there are so few non-draws; there is no escaping the black hole.

Furthermore, it should be possible to separately develop a draw model with Wilo + time information. I.e. how draws appear with more time (basically nodes per move). I think it's the kind of things that you like to compute as there are very few variables and you can use existing tools.

You also mentioned future experiments with a simplistic eval function, maybe just PST. I'll be busy with Othello until April, but I'll try to squeeze that as a side project (just computing the eval weights).

Fabien, you are probably aware that win/loss ratio is also the sole determinant for the statistical measure Likelihood of Superiority (LoS) that is provided by tools like BayesElo (from your compatriot Remi Coulom). So for match outcomes you can disregard draws. It's a nice empirical observation that LoS is invariant under time control.

Rein Halbersma · Post by **Rein Halbersma** » Wed Feb 22, 2017 19:58

MichelG wrote:
Fabien Letouzey wrote: Michel observed that while the number of draws greatly depends on time control, the win/loss ratio remains mostly invariant. If that's true (enough), I nominate "Wilo" as a progress measure that we can exchange. It's a pun on Elo and win/loss that was mentioned in chess; Wilo is simply the Elo formula applied to non-draw games. So you throw away the draws, like for a statistical test, and you compute Elo on the rest. If Michel is right, Wilo will depend little on time. Of course, the measurements will be imprecise since there are so few non-draws; there is no escaping the black hole.
I like the wilo name

I usually look at the win/lose ratio only anyway.

Here is a example match that i ran today between 2 versions of my engine, at 4 seconds per game:
games: 10700, player1 wins :633 player2 wins:731 draw:9336 score:0.991 (+/- 0.0034, -2.7 sigma) player 1 depth:12.348 player 2 depth 12.655, nondraws: 12.7%

Clearly player2 is better, getting 15% more wins and searching somewhat deeper on average. But if you look at elo, it scores 50.4% and that is only a 3 point difference. But in wilo it's 25 and i consider it worth the programming effort it took.

Michel

The likelihood of superiority is 99.6017%
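
For reference, a short sketch of the commonly used normal approximation for LoS, which depends only on wins and losses. The function name is an assumption, and tools such as BayesElo may compute the value differently internally; this version gives about 99.60% for the match above.

```python
import math

def likelihood_of_superiority(wins, losses):
    """Normal-approximation LoS; draws do not enter the formula."""
    z = (wins - losses) / math.sqrt(2.0 * (wins + losses))
    return 0.5 * (1.0 + math.erf(z))

# Player 2 versus player 1 in Michel's match.
print(f"{100 * likelihood_of_superiority(731, 633):.2f}%")  # 99.60%
```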
