Internet engine matches
Re: Internet engine matches
Rein, i dont consider this as a milestone.
As this was only a 1 core no pondering match with Kingsrow slightly handicapped by using only a 6p DB (as i dont have the 7p , 8p Kingsrow DB) whereas Damages uses a 7p DB.
Regarding hardware the circumstances were equal.
It only provides a little confidence that some of the ideas (and still being fine tuned , and other still not activated although coded and implemented) seem to work.
And that as a result Damage is improving step by step...
Regarding parallel search, I also have the YBWC implemente , and used in during a previous Dutch Open, but this one contains the older search so i need to do some coding.
Most likely I will face the fact that Ed has a more efficient parallel search implementation, so there the next challenge will start.
Also a 8p DB is basically no rocket science, but it requires computer capacity and time, so i dont expect this one in 2012.
As already posted I will share all these new ideas, and hope others will follow.
If not, I will do it anyway
Bert
As this was only a 1 core no pondering match with Kingsrow slightly handicapped by using only a 6p DB (as i dont have the 7p , 8p Kingsrow DB) whereas Damages uses a 7p DB.
Regarding hardware the circumstances were equal.
It only provides a little confidence that some of the ideas (and still being fine tuned , and other still not activated although coded and implemented) seem to work.
And that as a result Damage is improving step by step...
Regarding parallel search, I also have the YBWC implemente , and used in during a previous Dutch Open, but this one contains the older search so i need to do some coding.
Most likely I will face the fact that Ed has a more efficient parallel search implementation, so there the next challenge will start.
Also a 8p DB is basically no rocket science, but it requires computer capacity and time, so i dont expect this one in 2012.
As already posted I will share all these new ideas, and hope others will follow.
If not, I will do it anyway
Bert
Re: Internet engine matches
As already mentioned in a previous post I try to activate one by one the features which i implemented during my holiday, but did not test so far.
This time i wanted to change 2 parameters at once. Activate the breakthrough routine for FHR and activate IID (Internal Iterative Deepening).
Basically i use 2 implementations for the breakthrough, one routine returns a score and loops through all potential breakthrough man, the other outputs a boolean (true / false) , and returns as soon as a breakthrough is detected. I do the breakthrough check as a condition for FHR (in case the opponent has a breakthrough opportunity i don't apply the FHR mechanism).
Anyway, the IID was a disaster, as already during the 3th game the score exploded to +10 man, in a neutral quiet situation. I couldn't find a clue (but hope this bug is also related to other search instabilities observed). So I decided only to run a match with the new breakthrough routine (now bitmap based in stead of table based) for FHR.
Herewith the match results (conditions as usual):
Damage 14 win, Kingsrow 7 win, Draw 137.
Bert
This time i wanted to change 2 parameters at once. Activate the breakthrough routine for FHR and activate IID (Internal Iterative Deepening).
Basically i use 2 implementations for the breakthrough, one routine returns a score and loops through all potential breakthrough man, the other outputs a boolean (true / false) , and returns as soon as a breakthrough is detected. I do the breakthrough check as a condition for FHR (in case the opponent has a breakthrough opportunity i don't apply the FHR mechanism).
Anyway, the IID was a disaster, as already during the 3th game the score exploded to +10 man, in a neutral quiet situation. I couldn't find a clue (but hope this bug is also related to other search instabilities observed). So I decided only to run a match with the new breakthrough routine (now bitmap based in stead of table based) for FHR.
Herewith the match results (conditions as usual):
Damage 14 win, Kingsrow 7 win, Draw 137.
Bert
Re: Internet engine matches
Good result BertBertTuyt wrote:As already mentioned in a previous post I try to activate one by one the features which i implemented during my holiday, but did not test so far.
This time i wanted to change 2 parameters at once. Activate the breakthrough routine for FHR and activate IID (Internal Iterative Deepening).
Basically i use 2 implementations for the breakthrough, one routine returns a score and loops through all potential breakthrough man, the other outputs a boolean (true / false) , and returns as soon as a breakthrough is detected. I do the breakthrough check as a condition for FHR (in case the opponent has a breakthrough opportunity i don't apply the FHR mechanism).
Anyway, the IID was a disaster, as already during the 3th game the score exploded to +10 man, in a neutral quiet situation. I couldn't find a clue (but hope this bug is also related to other search instabilities observed). So I decided only to run a match with the new breakthrough routine (now bitmap based in stead of table based) for FHR.
Herewith the match results (conditions as usual):
Damage 14 win, Kingsrow 7 win, Draw 137.
Bert
The number of draws seems a bit worringly. I hope this is not going the way of 8x8 checkers, where all games at high level end in draws.
Hope you find the bug i IID
Michel
-
- Posts: 1722
- Joined: Wed Apr 14, 2004 16:04
- Contact:
Re: Internet engine matches
One can always play Killer draughts which has a much lower draw percentage than regular draughts. My engine also supports it so when I'm ready to play engine matches I sure would like to play against DragonKiller.MichelG wrote:
The number of draws seems a bit worringly. I hope this is not going the way of 8x8 checkers, where all games at high level end in draws.
Hope you find the bug i IID
Michel
Rein
Re: Internet engine matches
Michel I recognize your point.
With these differences it is very difficult to prove that you have found an improvement (or it should be something completely disruptive).
Think that Ed already used 3-move ballot matches to compare different versions of Kingsrow.
But I'm afraid that is the consequence of the rules of the Draughts game the draw margin is just too big.
So i can agree with Rein with a killer version differences would become far more visible.
When all programs would use 64bit, bitboards, 8p Endgame DB's (and for storage SSD in stead of HD), multicore processor (or even better dual processor) with parallel search and huge memories (today you can already buy a 2011 socket board with 8 DIMM slots, and 8 GByte for one slot is affordable, which makes 64 GByte, for DB cache , transposition table, ..), then the situation would even be worse.
So far not all programs use these options , but step by step you see more programs as Kingsrow, Damy (think Damy also uses parallel search), Damage, Dragon, Sjende Blyn, Maximus (im not sure if i did forgot one, in case sorry) .
The traditional programs like TD-King, Tornado, DIOS, Dam, X.X, GWD, Dream, Cerberus, (to name a few) should really make this step, otherwise they will loose connection.
A side effect is that the tournaments ,as we have today, more and more become a lottery (in the case all the top programs participate). Even if on program is stronger and would win a 158-games match with (for example) 10 wins 5 losses, 143 draws, and this program has this result with all programs participating, then I'm not sure in the end it will win (I trust Rein could do the math here, how often this program would win out of 100 tournaments).
Most likely the program would win who has the luck that the opponent crashes, and therefore gets the 2 points.
So maybe within a few years we must really think how we want to continue, and where we want to add value to the equation.
We indeed might face a situation as with checkers, where human-machine matches or machine-machine matches i think are really rare events these days.
Also in Computer Chess there seems to be diminishing interest as all are now aware that computers are strong (take the problems the CSVN has to attract and maintain members), and thanks to open communication (which is good !) everyone can assemble a ELO 3000 clone.
The option not to communicate and not to share is for me also not the way to proceed.
Therefore I really support Rein in sharing code and ideas.
Maybe one day we should return to the basics of Artificial Intelligence, not with brute force, but will intelligent programs which learn from mistakes, and have self-improvement capabilities.
But maybe there is a dark horse, a new program (or existing one with disruptive superior new ideas) which will beat anyone. Then at least we have a target to focus on....
Bert
With these differences it is very difficult to prove that you have found an improvement (or it should be something completely disruptive).
Think that Ed already used 3-move ballot matches to compare different versions of Kingsrow.
But I'm afraid that is the consequence of the rules of the Draughts game the draw margin is just too big.
So i can agree with Rein with a killer version differences would become far more visible.
When all programs would use 64bit, bitboards, 8p Endgame DB's (and for storage SSD in stead of HD), multicore processor (or even better dual processor) with parallel search and huge memories (today you can already buy a 2011 socket board with 8 DIMM slots, and 8 GByte for one slot is affordable, which makes 64 GByte, for DB cache , transposition table, ..), then the situation would even be worse.
So far not all programs use these options , but step by step you see more programs as Kingsrow, Damy (think Damy also uses parallel search), Damage, Dragon, Sjende Blyn, Maximus (im not sure if i did forgot one, in case sorry) .
The traditional programs like TD-King, Tornado, DIOS, Dam, X.X, GWD, Dream, Cerberus, (to name a few) should really make this step, otherwise they will loose connection.
A side effect is that the tournaments ,as we have today, more and more become a lottery (in the case all the top programs participate). Even if on program is stronger and would win a 158-games match with (for example) 10 wins 5 losses, 143 draws, and this program has this result with all programs participating, then I'm not sure in the end it will win (I trust Rein could do the math here, how often this program would win out of 100 tournaments).
Most likely the program would win who has the luck that the opponent crashes, and therefore gets the 2 points.
So maybe within a few years we must really think how we want to continue, and where we want to add value to the equation.
We indeed might face a situation as with checkers, where human-machine matches or machine-machine matches i think are really rare events these days.
Also in Computer Chess there seems to be diminishing interest as all are now aware that computers are strong (take the problems the CSVN has to attract and maintain members), and thanks to open communication (which is good !) everyone can assemble a ELO 3000 clone.
The option not to communicate and not to share is for me also not the way to proceed.
Therefore I really support Rein in sharing code and ideas.
Maybe one day we should return to the basics of Artificial Intelligence, not with brute force, but will intelligent programs which learn from mistakes, and have self-improvement capabilities.
But maybe there is a dark horse, a new program (or existing one with disruptive superior new ideas) which will beat anyone. Then at least we have a target to focus on....
Bert
Re: Internet engine matches
One way of determining if it is still possible to improve on the current high-end engines is this way:BertTuyt wrote:Michel I recognize your point.
With these differences it is very difficult to prove that you have found an improvement (or it should be something completely disruptive).
Think that Ed already used 3-move ballot matches to compare different versions of Kingsrow.
But I'm afraid that is the consequence of the rules of the Draughts game the draw margin is just too big.
So i can agree with Rein with a killer version differences would become far more visible.
- Take the best engine available (yours or ed's) and play a match against itself, with one side playing at normal rate (say 15 minutes / game) and the other side at 4-8 times as much time (1-2 hours/game)
If the percentage of draws is still extremely high (90+%), you know there is little to gain by thinking faster. At that point the programs just play near optimal and will in principle still draw frequently against a 40 man database.
Moving to killer might be one idea.
Another field of progress i am thinking about is to win from the human champion. Up to now, attempts seem to have been fairly futile. However i think programs could evolve to play better against humans.
For example the program may have choice between 6 moves, 4 leading to a draw and 2 to a loss. In that case, choose the drawing variant that has the meanest trick in it, for example a 14 ply combination. Even though he can draw after the move, if the grandmaster does not see the 14 ply combination he will lose.
Offer the grandmaster enough opportunity to fall into 14 ply traps (or even deeper), and he will eventually will fall into one some point. Humans play like this, and that works.
Re: Internet engine matches
Michel, thats also what i plan to do, but just the opposite.
In a later stage i want to play with limited time for Damage, and then try to learn from (the many) losses.
I agree that playing with longer time for one side, will determine to what extend faster search will help, but a drawback is that it is so time consuming.
As for statistics you need minimal 158 games, so these events take several days.
Maybe i need to change memory in my computer so i can go towards parallel matches.
In the mean time I found ( i hope) the bug in IID, but the match seems to be winning for KingsRow.
Math result so far: Games played 129. Kingsrow win 7, Damage win 4, Draw 111, and unknown 7 (of the unknown i expect at least 5 draws).
In case the 7 unknown appear to be Draw, than Draw percentage is > 90%.
When also activate the parallel search, maybe it is even worse.
So most likely that is back to the drawing boards to get a clue why....
It could be statistics, or, or, anyway, even if i find a clue, it will take another 158 match to check and verify the new implementation (if any).
Every week i read the column of Ton Sijbrands and test specific positions where one side makes the final mistake.
Quite often the computer sees the mistake mediately (like this week in a 4 - 2 endgame).
But grandmasters are extremely good in creation difficult positions where the opponents most likely will make a mistake.
Think some-one developed some computer approaches for this, if I'm not mistaken it is called opponent modeling.
But my goal first is improving the engine for computer - computer games.
In the mean time I'm also interested in difficult positions where grandmaster (or who-ever) played a brilliant move which proved to be winning in the extreme long term.
So hope some-one can post a few of these...
Bert
In a later stage i want to play with limited time for Damage, and then try to learn from (the many) losses.
I agree that playing with longer time for one side, will determine to what extend faster search will help, but a drawback is that it is so time consuming.
As for statistics you need minimal 158 games, so these events take several days.
Maybe i need to change memory in my computer so i can go towards parallel matches.
In the mean time I found ( i hope) the bug in IID, but the match seems to be winning for KingsRow.
Math result so far: Games played 129. Kingsrow win 7, Damage win 4, Draw 111, and unknown 7 (of the unknown i expect at least 5 draws).
In case the 7 unknown appear to be Draw, than Draw percentage is > 90%.
When also activate the parallel search, maybe it is even worse.
So most likely that is back to the drawing boards to get a clue why....
It could be statistics, or, or, anyway, even if i find a clue, it will take another 158 match to check and verify the new implementation (if any).
Every week i read the column of Ton Sijbrands and test specific positions where one side makes the final mistake.
Quite often the computer sees the mistake mediately (like this week in a 4 - 2 endgame).
But grandmasters are extremely good in creation difficult positions where the opponents most likely will make a mistake.
Think some-one developed some computer approaches for this, if I'm not mistaken it is called opponent modeling.
But my goal first is improving the engine for computer - computer games.
In the mean time I'm also interested in difficult positions where grandmaster (or who-ever) played a brilliant move which proved to be winning in the extreme long term.
So hope some-one can post a few of these...
Bert
Re: Internet engine matches
I finished the match with IID activated.
Kingsrow 11 wins, Damage 4 wins, Draw 143.
So not an overwhelming success
So back to the drawing board.
Possible explanations.
1) There is still a bug adding to search instability.
2) IID pollutes the hash-table.
3) IID consumes time, does not improve the search, and therefore average search depth reduces.
4) The program strength did not chance, and what we see is accepted statistical variations.
Hope anyone out there has a better clue, or has experience with IID....
Bert
Kingsrow 11 wins, Damage 4 wins, Draw 143.
So not an overwhelming success
So back to the drawing board.
Possible explanations.
1) There is still a bug adding to search instability.
2) IID pollutes the hash-table.
3) IID consumes time, does not improve the search, and therefore average search depth reduces.
4) The program strength did not chance, and what we see is accepted statistical variations.
Hope anyone out there has a better clue, or has experience with IID....
Bert
Re: Internet engine matches
Hi Bert,
Of course without knowing in which case exactly you use IID it is quite impossible to help you because a lot of explanations may exist.
My experience is the following: IID can be useful if you are searching a position in which the best move is not obvious otherwise IID may be a lost of time.BertTuyt wrote:I finished the match with IID activated.
Kingsrow 11 wins, Damage 4 wins, Draw 143.
So not an overwhelming success
So back to the drawing board.
Possible explanations.
1) There is still a bug adding to search instability.
2) IID pollutes the hash-table.
3) IID consumes time, does not improve the search, and therefore average search depth reduces.
4) The program strength did not chance, and what we see is accepted statistical variations.
Hope anyone out there has a better clue, or has experience with IID....
Bert
Of course without knowing in which case exactly you use IID it is quite impossible to help you because a lot of explanations may exist.
Gérard
Re: Internet engine matches
Hi Bert,
White to move
The repartition of white men is completely stupid and a grandmaster will certainly conclude that black should probably win the game. Being completely unable to prove this point, this position is perhaps a position leading to an extreme long term win.
BTW Kingsrow evaluation hardly reach 0.20 though a grandmaster will certainly evaluate the position at least to 0.50 or 0.60 or even more?
What is Damage evaluation after let's say 1 or 2 minutes?
I do not know what king of position you are looking for but at least I can give you a type of position on which I am working in order to impove a difficult point of my new eval function.BertTuyt wrote: ...
In the mean time I'm also interested in difficult positions where grandmaster (or who-ever) played a brilliant move which proved to be winning in the extreme long term.
So hope some-one can post a few of these...
Bert
White to move
The repartition of white men is completely stupid and a grandmaster will certainly conclude that black should probably win the game. Being completely unable to prove this point, this position is perhaps a position leading to an extreme long term win.
BTW Kingsrow evaluation hardly reach 0.20 though a grandmaster will certainly evaluate the position at least to 0.50 or 0.60 or even more?
What is Damage evaluation after let's say 1 or 2 minutes?
Gérard
-
- Posts: 190
- Joined: Sun Sep 13, 2009 23:33
- Real name: Jan-Jaap van Horssen
- Location: Zeist, Netherlands
Re: Internet engine matches
Congratulations to Bert and Michel for the latest advances in computer draughts!
Maximus uses "naive" IID -but not for shallow depths- and this seems to work pretty well, better than several restricted IID's I tried. A report on the match Schwarzman-Maximus will appear in the next issue of ICGA Journal, containing a (brief) description of Maximus. If you’re interested and don't subscribe to ICGA Journal, send me an email at janjaapvanhorssen at gmail dot com and I will send you the report once it is cleared.
Concerning forward pruning and search extensions I have a lot of catching up to do, and some ideas to try. Of course one of my goals is to beat Alexander Schwarzman next time...
Jan-Jaap
Maximus uses "naive" IID -but not for shallow depths- and this seems to work pretty well, better than several restricted IID's I tried. A report on the match Schwarzman-Maximus will appear in the next issue of ICGA Journal, containing a (brief) description of Maximus. If you’re interested and don't subscribe to ICGA Journal, send me an email at janjaapvanhorssen at gmail dot com and I will send you the report once it is cleared.
Concerning forward pruning and search extensions I have a lot of catching up to do, and some ideas to try. Of course one of my goals is to beat Alexander Schwarzman next time...
Jan-Jaap
www.maximusdraughts.org
-
- Posts: 190
- Joined: Sun Sep 13, 2009 23:33
- Real name: Jan-Jaap van Horssen
- Location: Zeist, Netherlands
Re: Internet engine matches
Hi Gerard,TAILLE wrote:I do not know what king of position you are looking for but at least I can give you a type of position on which I am working in order to impove a difficult point of my new eval function.
White to move
The repartition of white men is completely stupid and a grandmaster will certainly conclude that black should probably win the game. Being completely unable to prove this point, this position is perhaps a position leading to an extreme long term win.
BTW Kingsrow evaluation hardly reach 0.20 though a grandmaster will certainly evaluate the position at least to 0.50 or 0.60 or even more?
What is Damage evaluation after let's say 1 or 2 minutes?
Maximus static evaluation: -0.49, search score after 2 minutes (laptop, 2 threads): -0.36 (17 ply, move 34-30).
Jan-Jaap
www.maximusdraughts.org
Re: Internet engine matches
Gerard, static Damage evaluation for the position is -0.76 (-1 is down 1 man).
In 2 minutes Damage completes a 24 ply search with no major change in score -0.74, and the move played is 35-30.
Bert
In 2 minutes Damage completes a 24 ply search with no major change in score -0.74, and the move played is 35-30.
Bert
Re: Internet engine matches
JJ,
thanks for your post..
In my case i use for the main loop Iterative Deepening, and in case (with some boundary conditions) there is no valid hash-table move i do a shallow search (Internal Iterative Deepening).
Bert
thanks for your post..
What do you mean with native IID?Maximus uses "naive" IID -but not for shallow depths- and this seems to work pretty well, better than several restricted IID's I tried. A report on the match Schwarzman-Maximus will appear in the next issue of ICGA Journal, containing a (brief) description of Maximus. If you’re interested and don't subscribe to ICGA Journal, send me an email at janjaapvanhorssen at gmail dot com and I will send you the report once it is cleared.
In my case i use for the main loop Iterative Deepening, and in case (with some boundary conditions) there is no valid hash-table move i do a shallow search (Internal Iterative Deepening).
Bert
-
- Posts: 190
- Joined: Sun Sep 13, 2009 23:33
- Real name: Jan-Jaap van Horssen
- Location: Zeist, Netherlands
Re: Internet engine matches
Well, naive IID in the sense that it is used in every interior node where we have no TT best move (and #moves > 1 and say depth-to-go >= 4). Whereas in chess programs IID is only applied in very restricted cases, IIRC.BertTuyt wrote:What do you mean with native IID?
In my case i use for the main loop Iterative Deepening, and in case (with some boundary conditions) there is no valid hash-table move i do a shallow search (Internal Iterative Deepening).
Based on the observation that on average an iterative search is faster than an immediate full search to a certain depth. If it works for the root node, why not for every node? Of course it depends on your move ordering algorithm as well. There are many ways to combine different algorithms.
Jan-Jaap
www.maximusdraughts.org