Hi,
Is there a standard set of test positions for computer draughts like the Bratko-Kopec set for computer chess?
Regards,
Gijsbert
Standard set of test positions
-
- Posts: 21
- Joined: Sun Feb 20, 2011 21:04
- Real name: Gijsbert Wiesenekker
-
- Posts: 859
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Re: Standard set of test positions
Hi Gijsbert,
Nice to see that you have joined the forum.
I am not familiar with those chess test positions. We have a few test positions that we have been using with perft, to verify the move generator correctness, but I am not aware of any standard set of positions for testing complete engine performance.
When last I heard from you, you were generating the 7-piece dtw database for draughts. How is that going?
-- Ed
Re: Standard set of test positions
There was a test set long ago :)
Guess it was around 1990 or something like that.
I'm not sure but it might be that Leo Nagels still has the set.
There was also a test report with all programs around that time, which was even published (if my memory is still correct).
Bert
-
- Posts: 21
- Joined: Sun Feb 20, 2011 21:04
- Real name: Gijsbert Wiesenekker
Re: Standard set of test positions
Hi Ed,
The 4x3 have been calculated and the results are exactly the same as those from Michel Grimminck (http://www.xs4all.nl/~mdgsoft/draughts/stats/index.html) which is good as the algorithm is entirely different. The 5x2 are currently being calculated.
I have been using a set of 289 test positions I got from Klaas Bor (the author of Turbo Dambase) a couple of years ago. GWD declares a position 'solved' if the score returned from the search is more than half a man above the static evaluation score of the starting position. I have never checked whether all 289 positions meet this criterion for the 'true' solution, but with this criterion GWD solves 276 out of 289 with a time limit of 10 seconds, 2 of the remaining 13 with a time limit of 30 seconds, 1 of the remaining 11 with a time limit of 90 seconds, and 1 of the remaining 10 with a time limit of 270 seconds.
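As an illustration of the 'solved' criterion and the counts above, here is a minimal sketch. It is not GWD's code: the names `is_solved`, `remaining_after` and `kManValue` are hypothetical, and the man value of 100 is an assumption about the score scale.

```cpp
#include <cassert>

// Hypothetical value of one man in evaluation units (an assumption; the
// post does not state GWD's scale).
const int kManValue = 100;

// True when the search score beats the static evaluation of the starting
// position by more than half a man.
bool is_solved(int search_score, int static_eval, int man_value = kManValue) {
    assert(man_value > 0);
    return search_score - static_eval > man_value / 2;
}

// Positions still unsolved after the first `stages` time limits, given the
// number of positions newly solved at each limit.
int remaining_after(int total, const int solved_per_stage[], int stages) {
    int remaining = total;
    for (int i = 0; i < stages; ++i) remaining -= solved_per_stage[i];
    assert(remaining >= 0);
    return remaining;
}
```

With the counts {276, 2, 1, 1} for the 10, 30, 90 and 270 second limits, `remaining_after(289, counts, 1)` gives the 13 positions listed next, and `remaining_after(289, counts, 4)` gives the 9 that stay unsolved at 270 seconds.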
These are the 13 positions that remain after the 10 second search:
[FEN "W:W28,31,32,35,36,37,38,39,40,42,43,45,47:B3,7,8,11,12,13,15,19,20,21,23,26,29."]
[FEN "W:W27,28,32,35,38,40,43,44,45,47,48:B3,8,9,10,11,14,17,19,23,25,29."]
[FEN "W:W16,21,22,23,26,27,28,31,32,38,43,44,47,50:B2,4,6,7,8,9,10,12,13,24,25,30,35."]
[FEN "W:W21,24,29,31,34,36,37,42,48,49:B1,7,9,10,12,13,17,22,26,28."]
[FEN "W:W15,22,27,31,36,40,44,50:B1,2,4,7,19,23,35."]
[FEN "W:W15,20,21,24,27,34,35,40,48,49:B4,8,9,10,11,12,13,18,33,38."]
[FEN "W:W11,16,23,28,29,32,33,34,37,38,40,50:B2,3,4,6,7,9,12,13,15,18,36,45."]
[FEN "W:W12,19,23,24,26,32,40,41,42,44,47,48,50:B3,8,11,13,17,21,28,30,33,35,36,39,43."]
[FEN "W:W26,27,28,32,34,37,38,40,42,44,49,50:B1,7,9,10,11,13,16,19,20,23,24,25."]
[FEN "W:W26,28,30,34,37,39,41,42,43,44,45:B7,8,12,14,17,19,21,24,25,35,36."]
[FEN "W:W18,23,27,33,34,35,36,39,46,47,50:B1,6,7,8,10,17,20,24,25,37,45."]
[FEN "W:W26,32,33,35,38,41,43,49,50:B4,8,14,17,18,24,25,37."]
[FEN "W:W21,27,34,42,46,49:B8,11,16,18,23,24."]
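For anyone who wants to load these strings into their own program, the layout is side to move, then the white squares, then the black squares, separated by colons and closed with a dot. A minimal parser sketch follows; the type and function names are illustrative only, and the `K` prefix for kings (which does not occur in the 13 positions) is included as an assumption about the notation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One piece: square number and whether it is a king.
struct Piece {
    int square;
    bool king;
};

// Parsed position; struct and field names are illustrative only.
struct Position {
    char side_to_move;  // 'W' or 'B'
    std::vector<Piece> white;
    std::vector<Piece> black;
};

// Parse one comma-separated piece list such as "28,31,K50".
std::vector<Piece> parse_pieces(const std::string& list) {
    std::vector<Piece> pieces;
    std::size_t pos = 0;
    while (pos < list.size()) {
        std::size_t end = list.find(',', pos);
        if (end == std::string::npos) end = list.size();
        std::string token = list.substr(pos, end - pos);
        bool king = !token.empty() && token[0] == 'K';
        if (king) token.erase(0, 1);
        assert(!token.empty());
        pieces.push_back(Piece{std::stoi(token), king});
        pos = end + 1;
    }
    return pieces;
}

// Parse a string like "W:W21,27:B8,11." (without the [FEN "..."] wrapper).
Position parse_fen(std::string fen) {
    if (!fen.empty() && fen.back() == '.') fen.pop_back();  // drop trailing dot
    assert(fen.size() >= 2 && fen[1] == ':');
    Position p;
    p.side_to_move = fen[0];
    std::size_t pos = 2;
    while (pos < fen.size()) {
        std::size_t end = fen.find(':', pos);
        if (end == std::string::npos) end = fen.size();
        std::string field = fen.substr(pos, end - pos);
        assert(!field.empty());
        std::vector<Piece> pieces = parse_pieces(field.substr(1));
        if (field[0] == 'W') p.white = pieces; else p.black = pieces;
        pos = end + 1;
    }
    return p;
}
```

The sketch does no range checking on square numbers and uses `assert` for malformed input, so it is only suitable for trusted test files.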
I would be interested to know what your programs think of these positions.
Gijsbert
-
- Posts: 1722
- Joined: Wed Apr 14, 2004 16:04
- Contact:
Re: Standard set of test positions
gwiesenekker wrote: I have been using a set of 289 test positions I got from Klaas Bor (the author of Turbo Dambase) a couple of years ago. GWD declares a position as 'solved' if the score returned from the search is more than half a man score above the static evaluation score of the starting position. I have never checked if all 289 positions meet this criterion for the 'true' solution, but with this criterion GWD solves 276 out of 289 with a time limit of 10 seconds, 2 out of the remaining 13 with a time limit of 30 seconds, 1 out of the remaining 11 with a time limit of 90 seconds, and 1 out of the remaining 10 with a time limit of 270 seconds.
Hi Gijsbert,
A good idea to have a test for search efficiency. I also have a correctness test in my program. From Michel Grimminck's website, I took all the longest winning positions for the endgame database for up to 4 pieces. Below is some sample code for the 2 vs 2 endgame. What it does is search a known database position to a depth equal to the database win length. My search distinguishes scores of win-in-N from longer or shorter wins, so I can do an assert() on the returned search score. My search passes all these correctness tests. With repetition checking turned on, there are some endgames in Killer draughts that give incorrect results, but I haven't experienced that for regular draughts.
Rein
Code:
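// Fragment from a unit test body. Root, Variant, Value, read_position_string
// and FEN_tag are defined in the program's own headers (not shown).

// A position string paired with its database win length.
typedef std::pair<std::string, size_t> DB_unittest;
int value;

// Trailing comments give the piece counts: white men, white kings,
// black men, black kings.
DB_unittest DB_win22[] = {
    DB_unittest("W:W33,46:B4,5."     , 39), // 2020
    DB_unittest("W:W8,K50:B3,32."    , 27), // 1120
    DB_unittest("W:WK1,K23:B4,38."   , 25), // 0220
    DB_unittest("W:W17,35:B3,K21."   , 23), // 2011
    DB_unittest("W:WK1,12:B16,K50."  , 19), // 1111
    DB_unittest("W:WK1,K16:BK17,26." , 19), // 0211
    DB_unittest("W:W6,12:BK7,K45."   ,  7), // 2002
    DB_unittest("W:W6,K22:BK17,K50." ,  9), // 1102
    DB_unittest("W:WK6,K22:BK17,K50.",  9)  // 0202
};

// Search each position to a depth equal to its win length and check that
// the score is exactly a win in that many moves.
for (size_t i = 0; i < sizeof(DB_win22) / sizeof(DB_win22[0]); ++i) {
    value = Root::analyze<Variant::International>(read_position_string<FEN_tag>()(DB_win22[i].first), DB_win22[i].second);
    assert(value == Value::win(DB_win22[i].second));
}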
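For readers wondering how a search score can distinguish a win-in-N from a longer or shorter win, one common scheme is to subtract the distance from a large constant. The sketch below is not taken from Rein's program: `kInfinity`, `win_in`, `loss_in` and `plies_to_win` are hypothetical names, and counting the distance in plies is an assumption.

```cpp
#include <cassert>

// Hypothetical score bound; any value well above the evaluation range works.
const int kInfinity = 32000;

// Score for a win in `plies` plies: shorter wins get higher scores.
int win_in(int plies) {
    assert(plies > 0 && plies < kInfinity);
    return kInfinity - plies;
}

// Score for a loss in `plies` plies, seen from the losing side.
int loss_in(int plies) {
    return -win_in(plies);
}

// Recover the distance to the win from a win score.
int plies_to_win(int score) {
    assert(score > 0 && score < kInfinity);
    return kInfinity - score;
}
```

Because every distance maps to a different score, an exact comparison such as `value == win_in(39)` fails if the search finds a win of any other length, which is what makes an assert on the returned score a useful correctness check.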
-
- Posts: 859
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Re: Standard set of test positions
"I would be interested to know what your programs think of these positions."
Here are some results with kingsrow using 4 search threads. Search scores are absolute, not relative to side to move. +100 means 1 black man advantage.
[FEN "W:W28,31,32,35,36,37,38,39,40,42,43,45,47:B3,7,8,11,12,13,15,19,20,21,23,26,29."]
{No advantage found.}
[FEN "W:W27,28,32,35,38,40,43,44,45,47,48:B3,8,9,10,11,14,17,19,23,25,29."]
{-50, 44-39, 4 seconds}
[FEN "W:W16,21,22,23,26,27,28,31,32,38,43,44,47,50:B2,4,6,7,8,9,10,12,13,24,25,30,35."]
{-248, 21-17, 2 seconds}
[FEN "W:W21,24,29,31,34,36,37,42,48,49:B1,7,9,10,12,13,17,22,26,28."]
{-74, 42-38, 13 seconds}
[FEN "W:W15,22,27,31,36,40,44,50:B1,2,4,7,19,23,35."]
{-120, 40-34, 1 second}
[FEN "W:W15,20,21,24,27,34,35,40,48,49:B4,8,9,10,11,12,13,18,33,38."]
{White db win, 49-43, 1 second}
[FEN "W:W11,16,23,28,29,32,33,34,37,38,40,50:B2,3,4,6,7,9,12,13,15,18,36,45."]
{-288, 50-44, 2 seconds}
[FEN "W:W12,19,23,24,26,32,40,41,42,44,47,48,50:B3,8,11,13,17,21,28,30,33,35,36,39,43."]
{White db win, 23-18, > 10 minutes}
[FEN "W:W26,27,28,32,34,37,38,40,42,44,49,50:B1,7,9,10,11,13,16,19,20,23,24,25."]
{-56, 38-33, 502 seconds}
[FEN "W:W26,28,30,34,37,39,41,42,43,44,45:B7,8,12,14,17,19,21,24,25,35,36."]
{170, 45-40, 1 second}
[FEN "W:W18,23,27,33,34,35,36,39,46,47,50:B1,6,7,8,10,17,20,24,25,37,45."]
{White wins in 39 plies, 6 seconds.}
[FEN "W:W26,32,33,35,38,41,43,49,50:B4,8,14,17,18,24,25,37."]
{db draw, 43-39, < 1 second}
[FEN "W:W21,27,34,42,46,49:B8,11,16,18,23,24."]
{db draw, 27-22, < 1 second}
-- Ed
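To compare these absolute scores with an engine that reports scores relative to the side to move, the sign has to be flipped when White is to move, since positive values here favour Black. A minimal sketch, with hypothetical names `Side`, `to_relative` and `in_men`, and a man value of 100 taken from the convention stated above:

```cpp
#include <cassert>

enum class Side { White, Black };

// Convert an absolute score (positive favours Black) into a score seen
// from the side to move.
int to_relative(int absolute_score, Side to_move) {
    return to_move == Side::Black ? absolute_score : -absolute_score;
}

// Express a score in men, with 100 units per man by default.
double in_men(int score, int man_value = 100) {
    assert(man_value > 0);
    return static_cast<double>(score) / man_value;
}
```

All 13 positions have White to move, so a reported score of -248 corresponds to +248 from White's point of view, or about 2.5 men.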