8 pieces endgame database

Discussion about development of draughts in the time of computer and Internet.
MichelG
Posts: 244
Joined: Sun Dec 28, 2003 20:24

Re: 8 pieces endgame database

Post by MichelG »

Rein Halbersma wrote: What kind of memory do you have? There are some articles on Wiki and dedicated computer magazines on the benefits of using ECC RAM. This will catch and correct single bit errors. Owing to cosmic radiation, the base level error rate might be something like 1 bit per day per 1Gb of memory. Of course, if you use ECC RAM, you then might also have to upgrade to a server motherboard and appropriate CPU to actually make use of the more expensive RAM. Perhaps Ed can provide us with more details about his pc setup?
In that computer, it is just plain old DDR2. ECC will probably help prevent some of these errors, but not all. There may be errors on the CPU, the buses or the hard disk. (Hard disks usually have an error rate of about 10^-16 per bit.)

I don't think there is much you can do about it; if you make a very long calculation with a computer, where every swapped bit is essential, you have to do some sort of verification.

The error rate you mention (1 bit/day/GB) can't be right though, otherwise the endgame generator would already have had hundreds of memory failures.
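As a rough sanity check on that rate, the expected number of flipped bits scales linearly with memory size and run time. The sketch below assumes 12.5 GB of RAM (the 10^11 bits mentioned later in this thread) and a purely hypothetical 30-day build; neither figure is given for this machine, so the result is only an order of magnitude.

```python
def expected_bit_errors(rate_per_gb_day: float, ram_gb: float, days: float) -> float:
    """Expected bit flips for a constant error rate per GB per day."""
    return rate_per_gb_day * ram_gb * days


RATE = 1.0               # 1 bit/day/GB, the rate quoted above
RAM_GB = 1e11 / 8 / 1e9  # assumption: 10^11 bits of RAM = 12.5 GB
BUILD_DAYS = 30          # hypothetical build duration, not from the thread

print(expected_bit_errors(RATE, RAM_GB, BUILD_DAYS))
```

Under these assumptions the quoted rate would indeed predict failures in the hundreds, which is why it looks too high.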

Banks and such have extremely expensive mainframes to prevent this from happening, but at €100000/core I would rather run a verification process :-)

Michel
Ed Gilbert
Posts: 862
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Re: 8 pieces endgame database

Post by Ed Gilbert »

Rein, I have ECC memory in both my build machines. But I agree with Michel that this is no cureall as it does not catch many kinds of errors. It did not prevent two statistics errors from occurring in my counts. Some kind of verify is necessary for any computation that takes such a long time as building these databases. Even the verify that I do has some holes in it. The verify pass is run immediately after the db is built and compressed, and it shares the same cache of successor positions that the build uses. An error in the cache could go undetected.

-- Ed
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: 8 pieces endgame database

Post by TAILLE »

Hi,
Ed Gilbert wrote:Rein, I have ECC memory in both my build machines. But I agree with Michel that this is no cureall as it does not catch many kinds of errors. It did not prevent two statistics errors from occurring in my counts. Some kind of verify is necessary for any computation that takes such a long time as building these databases. Even the verify that I do has some holes in it. The verify pass is run immediately after the db is built and compressed, and it shares the same cache of successor positions that the build uses. An error in the cache could go undetected.

-- Ed
I have a (rather long) verification process that I used only when testing my multithreaded generation algorithm on the 6 pieces endgame db. For your information, I have ECC memory and I have never observed any hardware problem, but I know it might happen.
Anyway, I do not think Michel or I have to run a verification process systematically. The best verification is to compare our results with the results of another program.
In case of an error in one db:
1) The probability is very high that the counts for the db concerned will differ
2) If not, the probability is very high that the error propagates to another db => different counts for that other db
I cannot really imagine an undetected bug in a db if we exchange our counts.
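Exchanging counts could be automated along these lines. This is only a sketch: the slice names follow the ones used in this thread, but the (wins, draws, losses) tuple layout and all numbers are made up for illustration.

```python
from typing import Dict, List, Tuple

Counts = Tuple[int, int, int]  # assumed layout: (wins, draws, losses)


def mismatched_slices(mine: Dict[str, Counts], theirs: Dict[str, Counts]) -> List[str]:
    """Return slices whose counts differ, or that only one program reported."""
    names = sorted(set(mine) | set(theirs))
    return [name for name in names if mine.get(name) != theirs.get(name)]


# Made-up counts from two hypothetical programs.
program_a = {"0404": (10, 5, 3), "1313": (7, 7, 7)}
program_b = {"0404": (10, 5, 3), "1313": (7, 8, 6)}

print(mismatched_slices(program_a, program_b))
```

Any slice in the returned list would then be the first candidate for a full verification pass.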
In addition, for this 8 pieces db I intend to run my verification process on the db slices without any kings and, if a problem is detected, on the db slices with only one king, etc.

Of course Ed was in a very uncomfortable position because he was the first to generate the 8 pieces db. For Michel and me it is far easier!
Gérard
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: 8 pieces endgame database

Post by TAILLE »

Hi,

Can we exchange on the size of the 8 pieces db slices ?

I do have for the moment only the following slices available :
0404 : 25 532 Kb
0413 + 1304 : 565 936 Kb
1313 : 1 408 341 Kb
0422 + 2204 : 2 618 003 Kb
1322 + 2213 : 9 248 768 Kb
Gérard
MichelG
Posts: 244
Joined: Sun Dec 28, 2003 20:24

Re: 8 pieces endgame database

Post by MichelG »

TAILLE wrote:Hi,
I have a (rather long) verification process that I used only when testing my multithreaded generation algorithm on the 6 pieces endgame db. For your information, I have ECC memory and I have never observed any hardware problem, but I know it might happen.
Anyway, I do not think Michel or I have to run a verification process systematically. The best verification is to compare our results with the results of another program.
In case of an error in one db:
1) The probability is very high that the counts for the db concerned will differ
2) If not, the probability is very high that the error propagates to another db => different counts for that other db
I cannot really imagine an undetected bug in a db if we exchange our counts.
In addition, for this 8 pieces db I intend to run my verification process on the db slices without any kings and, if a problem is detected, on the db slices with only one king, etc.

Of course Ed was in a very uncomfortable position because he was the first to generate the 8 pieces db. For Michel and me it is far easier!
I am not so sure about that. If the numbers match, this doesn't necessarily mean that the databases are equal; if there are at least 2 errors in the database they can cancel out in the count.

Admittedly, the chance of this seems slim, but it can happen in various ways:
- a double memory failure within the same database
- a byte with a read error that is miscorrected to a basically random value, giving you 4 or 8 errors depending on your implementation
- a single bit error near the end of a compression block

When you are doing 10^17 operations with 10^11 bits of RAM, things can go wrong in ways you may not expect.
In any case, if the counts are the same, the number of errors in the database must be very small, with 0 and 2 the most likely.
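A tiny illustration of two errors cancelling in the counts: the value encoding (0 = loss, 1 = draw, 2 = win) and the six-position "database" are hypothetical. A digest over the raw values still notices the difference, although comparing digests between programs would only work if both used the same position ordering and encoding, which is an assumption here.

```python
import hashlib
from collections import Counter


def wld_counts(values: bytes) -> Counter:
    """Count how often each result value occurs."""
    return Counter(values)


def digest(values: bytes) -> str:
    """Order-sensitive checksum of the raw result values."""
    return hashlib.sha256(values).hexdigest()


good = bytes([0, 1, 2, 1, 0, 2])  # hypothetical encoding: 0=loss, 1=draw, 2=win

corrupted = bytearray(good)
corrupted[0], corrupted[1] = 1, 0  # two errors: loss->draw and draw->loss
bad = bytes(corrupted)

print(wld_counts(good) == wld_counts(bad))  # counts cannot tell them apart
print(digest(good) == digest(bad))          # the digest can
```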

But I don't worry too much about it; even 2 errors in 10^13 positions won't affect playing strength.

Michel
Last edited by MichelG on Wed Sep 29, 2010 08:46, edited 2 times in total.
MichelG
Posts: 244
Joined: Sun Dec 28, 2003 20:24

Re: 8 pieces endgame database

Post by MichelG »

TAILLE wrote:Hi,

Can we exchange on the size of the 8 pieces db slices ?

I do have for the moment only the following slices available :
0404 : 25 532 Kb
0413 + 1304 : 565 936 Kb
1313 : 1 408 341 Kb
0422 + 2204 : 2 618 003 Kb
1322 + 2213 : 9 248 768 Kb
Dragon compresses the databases during the build, but it compresses with all capture positions in it. For 0404 this gives 2.3 GB of data.
0404: 2 GB
0413+1304: 20 GB
1313: 40 GB
0422+2204: 131 GB

When everything else is finished, I will recompress without the capture positions, which should give far better results, but I only intend to use the databases without a lot of kings.

Even so, I think your numbers look very impressive. What kind of compression scheme do you use?

Michel
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: 8 pieces endgame database

Post by TAILLE »

MichelG wrote: Dragon compresses the databases during the build, but it compresses with all capture positions in it. For 0404 this gives 2.3 GB of data.
0404: 2 GB
0413+1304: 20 GB
1313: 40 GB
0422+2204: 131 GB
Michel
Up to the 7 pieces db I also store captures in the db in order to speed up the generation process. For the 8 pieces db generation I do not store captures, to save disk space (for the db itself) and memory space (for the block indexes).
Anyway we can compare our figures for the 7 pieces db. For Damy they are :
4x3 positions : 13,0 Gb for positions without capture + 40,4 Gb for positions with capture
5x2 positions : 7,2 Gb for positions without capture + 17,5 Gb for positions with capture
6x1 positions : 0,24 Gb for positions without capture + 2,1 Gb for positions with capture

MichelG wrote: Even so, i think your numbers look very impressive. What kind of compression scheme do you use?
Michel
I suspect Damy's compression scheme is (very) different from yours. The idea is not to store the results of the positions but to store in a tree the list of positions with a given result.
Taking for example a 4031-type position (with only one king), the principle is to use the first man to build the first level of the tree (so the root has at most 45 successors), then the second man to build the second level (44 or 45 successors), and so on up to the 7th man. We are now at a leaf of the tree and have to describe which positions of the remaining piece (a king in my example) belong to this db. Here I use several formats depending on the number of positions concerned (1 to 43): one byte for only 1 position, 2 bytes for 2 or 3 positions, etc., and 6 bytes for 16 or more positions.
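The tree idea can be sketched roughly as follows. This is not Damy's code: nested Python dicts stand in for the tree levels, a plain set stands in for the compact leaf formats (whose exact byte layouts are not given here), and the example uses 3 men instead of 7 to keep it short.

```python
from typing import Iterable, Tuple

Position = Tuple[Tuple[int, ...], int]  # (squares of the men, square of the king)


def build_tree(positions: Iterable[Position]) -> dict:
    """Store all positions sharing one result; one tree level per man."""
    root: dict = {}
    for men, king in positions:
        node = root
        for square in men[:-1]:
            node = node.setdefault(square, {})
        # Leaf: the set of king squares for this placement of the men.
        node.setdefault(men[-1], set()).add(king)
    return root


def contains(tree: dict, men: Tuple[int, ...], king: int) -> bool:
    """True if the position is in the list stored in this tree."""
    node = tree
    for square in men[:-1]:
        node = node.get(square)
        if node is None:
            return False
    return king in node.get(men[-1], set())


# Hypothetical winning positions: men on squares 1, 2, 3 (or 4), king elsewhere.
win_tree = build_tree([((1, 2, 3), 28), ((1, 2, 3), 33), ((1, 2, 4), 28)])

print(contains(win_tree, (1, 2, 3), 33))
print(contains(win_tree, (1, 2, 4), 33))
```

Positions that share the placement of their first men share the upper levels of the tree, so the common prefix (1, 2) above is stored only once.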
Gérard
Ed Gilbert
Posts: 862
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Re: 8 pieces endgame database

Post by Ed Gilbert »

Slice sizes:

db0404: 45.7mb
db0413: 863mb
db0422: 3911mb
db1313: 2331mb
db1322: 13669mb

Gerard, are you storing all non-capture WLD data? I thought you had some scheme where you omit one of the 3 values and infer it from it not being one of the other two.

-- Ed
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: 8 pieces endgame database

Post by TAILLE »

Ed Gilbert wrote:Slice sizes:

db0404: 45.7mb
db0413: 863mb
db0422: 3911mb
db1313: 2331mb
db1322: 13669mb

Gerard, are you storing all non-capture WLD data? I thought you had some scheme where you omit one of the 3 values and infer it from it not being one of the other two.

-- Ed
Yes Ed, for a given db, for example db 0422, I have two separate dbs: the win_db containing all winning positions (or non-winning positions if that is more advantageous) and the lose_db containing all losing positions (or non-losing positions).
Strictly speaking I need one or two db accesses to get the result of a given position.
In practice, however, I always need only one db access because I use the MTD(f)-based algorithm. As a consequence, depending on testValue, I read only the pertinent db.
As you can see, when I read a block from disk, I am able to practically double the number of positions put in memory, because I only very seldom have to use both a positive and a negative testValue when analysing a real game position.
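A minimal sketch of that single-access probe, assuming the two dbs can be modelled as sets of positions and that a positive testValue asks "is this a win?" while any other testValue asks "is this at least a draw?". The names and the sign convention are illustrative assumptions, not Damy's actual interface.

```python
from typing import Set

WIN, DRAW, LOSS = 1, 0, -1


def probe(position: str, test_value: int, win_db: Set[str], lose_db: Set[str]) -> int:
    """Return a bound on the result that is enough for a null-window test."""
    if test_value > 0:
        # Only win_db is read; DRAW here means "at most a draw".
        return WIN if position in win_db else DRAW
    # Only lose_db is read; DRAW here means "at least a draw".
    return LOSS if position in lose_db else DRAW


win_db = {"pos_a"}   # hypothetical position keys
lose_db = {"pos_b"}

print(probe("pos_a", 1, win_db, lose_db))
print(probe("pos_b", -1, win_db, lose_db))
```

The returned value is only a bound: with a negative testValue a won position comes back as DRAW, which is still sufficient to answer the question the search asked.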
Gérard
Ed Gilbert
Posts: 862
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Re: 8 pieces endgame database

Post by Ed Gilbert »

Gerard, have you tried the db benchmark to see how your lookup speed compares to Bert's and mine?

-- Ed
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: 8 pieces endgame database

Post by TAILLE »

Ed Gilbert wrote:Gerard, have you tried the db benchmark to see how your lookup speed compares to Bert's and mine?

-- Ed
No Ed, because it does not seem that relevant for my implementation.
Seeing the "EGTB value" topic on the chess forum I am now convinced that it is hardly a good idea to read block db on disk during a real game.
As a consequence I am working on another implementation based on the following principles :
1) Access to the hard disk will be allowed only at the root and its immediate successors => disk access time no longer matters
2) The most relevant db slices will be loaded into memory before the beginning of the game.
Concerning this 2nd point, assuming you accept calculating the result of a position by accessing the memory db for the position concerned and possibly all its successors, you can fit the complete 2-7 pieces db in 6 Gb of memory. If this figure is too high, you can load only the 7 pieces db slices with at most one king per color; in that case you need only 1,5 Gb, which is quite reasonable.

Ed, do you think the 8 pieces db can really be used when you are not very very near the root of the tree ?
Gérard
Ed Gilbert
Posts: 862
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Re: 8 pieces endgame database

Post by Ed Gilbert »

Seeing the "EGTB value" topic on the chess forum I am now convinced that it is hardly a good idea to read block db on disk during a real game.
...
Ed, do you think the 8 pieces db can really be used when you are not very very near the root of the tree ?
Gerard, I don't know about computer chess, but I am probing the 8 and 9 piece db at all levels in the tree, using a priority function to determine whether to do an unconditional lookup or only look up a value if it is in cache. It certainly works well, and if you don't do this I think you are wasting most of the power of the large databases.
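In outline, such a priority-gated probe might look like the sketch below. The threshold, the per-position cache, and the way the priority is supplied are assumptions for illustration; the inputs to the real priority function are not described in this thread, and a real cache would more likely hold db blocks than single positions.

```python
from typing import Callable, Dict, Optional


class DbProbe:
    """Cache-first lookup that reads from disk only for high-priority nodes."""

    def __init__(self, read_from_disk: Callable[[str], int], threshold: float):
        self.cache: Dict[str, int] = {}
        self.read_from_disk = read_from_disk  # hypothetical disk reader
        self.threshold = threshold            # hypothetical priority cut-off
        self.disk_reads = 0

    def lookup(self, position: str, priority: float) -> Optional[int]:
        if position in self.cache:
            return self.cache[position]
        if priority < self.threshold:
            return None  # cache-only lookup: not worth a disk read
        self.disk_reads += 1
        value = self.read_from_disk(position)
        self.cache[position] = value
        return value


probe = DbProbe(read_from_disk=lambda position: 1, threshold=0.5)
print(probe.lookup("pos_x", priority=0.1))  # low priority, not cached
print(probe.lookup("pos_x", priority=0.9))  # high priority, read from disk
print(probe.lookup("pos_x", priority=0.1))  # now served from cache
```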

-- Ed
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: 8 pieces endgame database

Post by TAILLE »

Ed Gilbert wrote:
Seeing the "EGTB value" topic on the chess forum I am now convinced that it is hardly a good idea to read block db on disk during a real game.
...
Ed, do you think the 8 pieces db can really be used when you are not very very near the root of the tree ?
Gerard, I don't know about computer chess, but I am probing the 8 and 9 piece db at all levels in the tree, using a priority function to determine whether to do an unconditional lookup or only look up a value if it is in cache. It certainly works well, and if you don't do this I think you are wasting most of the power of the large databases.

-- Ed
Hi Ed.
As you know I have no experience at all in using the 8 pieces db and I have still some months to wait before using it!
I am preparing a lot of tests in order to see what a good strategy could be:
1) At what level of the tree is it worthwhile to access the disk?
2) Is it worthwhile to access the disk when you are deeper in the tree but certainly very near the PV?
3) Should disk access depend on the db slice concerned? I can easily imagine that it is practically useless to access the db for the 2222 db (almost always a draw) or the 4112 db (almost always a win), but it is certainly more useful to access the 1412 db because the result is very uncertain, etc.
4) The decision to access the disk may also depend on the estimated value of the root position. If the root position is clearly equal, or clearly almost winning or losing, a disk db access during the search would hardly help to find the right move. On the other hand, when there is a real advantage but not an obviously decisive one, access to the disk may be of great help.
5) etc.
A lot of interesting tests for the future, isn't it?

For the time being I am looking for the best strategy to load into memory (before the beginning of the game) a significant part of the 7 pieces db.
Gérard
Rein Halbersma
Posts: 1723
Joined: Wed Apr 14, 2004 16:04

Re: 8 pieces endgame database

Post by Rein Halbersma »

Ed Gilbert wrote:
Seeing the "EGTB value" topic on the chess forum I am now convinced that it is hardly a good idea to read block db on disk during a real game.
...
Ed, do you think the 8 pieces db can really be used when you are not very very near the root of the tree ?
Gerard, I don't know about computer chess, but I am probing the 8 and 9 piece db at all levels in the tree, using a priority function to determine whether to do an unconditional lookup or only look up a value if it is in cache. It certainly works well, and if you don't do this I think you are wasting most of the power of the large databases.

-- Ed
I think the cost-benefit calculation for chess is very different compared to draughts/checkers.
http://kirill-kryukov.com/chess/tablebases-online/

5-pc chess endgames fit within 8 Gb of memory. However, with a few exceptions, the complexity of such endgames is relatively low and most programs will find a win by searching alone. It's a bit like having the 4-pc draughts endgames in the days when memory was limited to 16M: it doesn't add a lot of strength but it doesn't hurt either.

The 6-pc chess endgames require about 1 Tb of disk space. The complexity of these endgames can be extremely high, with longest win sequences of hundreds of moves. However, most of these endgames are not of much practical relevance. Sizewise, they are comparable to having the complete 8-pc and the incomplete 5m-4m 9-pc draughts database.

As Ed pointed out, without caching and a finely tuned priority function, the search will grind to a halt. I'm not sure if the chess folks ever invested in developing a good caching system to reduce the disk I/O. It surely is not an easy task, as even the Chinook team admitted they didn't use the partial information 6m-5m 11-pc database because of the resulting excessive disk I/O. But I cannot imagine that not using the dbs is ever better than a well-tuned lookup function. There could be diminishing marginal returns, but there should surely not be diminishing returns!
Ed Gilbert
Posts: 862
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Re: 8 pieces endgame database

Post by Ed Gilbert »

The main benefit of the 8 and 9pc db is in analyzing games. It probably does not make a big difference in Elo. When there are, for example, 14 pieces on the board and you want to know conclusively if the position is a win or draw, you need to probe the databases deep into the search tree. A strong program will usually know the correct move to make, but if it only probes the large db at the root it will not be able to give a conclusive game result. IIRC there are probably some examples in the thread "Help from the 8 pieces endgame database" where the root position has more than 8 or 9 pieces, the program using a 7pc db could not get a conclusive result, but a conclusive result was obtained using an 8pc db and probing throughout the search.

-- Ed