EndGame Databases

Discussion about development of draughts in the time of computer and Internet.
Post Reply
64_bit_checkers_engine
Posts: 62
Joined: Mon Apr 20, 2009 01:10

Post by 64_bit_checkers_engine »

Rein Halbersma wrote:
TAILLE wrote:
Rein Halbersma wrote:For the larger databases (>6 pieces), they have for every database subdivision (wm, wk, bm, wk, wr, br tuple) a plain text index file with for every 4K block of data in the database the 32-bit index of the position that starts the block. Suppose you build the full 8 pc databases and ultimately have a 512 Gb database on disk. That means 128M blocks of indices, 4 bytes per starting index so about 512 Mb of indexing data to be in RAM at all times.
How do you manage to have only 4 bytes for the starting index in the case of the 8 pieces db ? Each slices of this db have far more than 4G positions.
This is how Ed describes it:
http://pages.prodigy.net/eyg/Checkers/10-pieceBuild.htm

"subdivide each slice into fixed size subdivisions of size 2^32. To find which subdivision of a slice a position is in, the 64-bit slice index is computed for the position, and this number is divided by 2^32. The integer quotient identifies which subdivision contains the index, and the 32-bit remainder from the division becomes the index of the position within that subdivision."

So you use 64-bit indices during building, but 32-bit indices in the index file.
Actually, that is how I described it to Ed Glibert, and my name is Ed too, so your comment is correct

[img]images/smilies/icon_smile.gif[/img]

However, Ed Gilbert eventually used 2^31 instead of 2^32 for reasons he mentioned later.
BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Post by BertTuyt »

I have a specific issue with my attempt to add a database cache in my program, in stead of reading all files at the start.

What I now do is to open ( fopen() )all sub-databases (so including all sliced sun-versions) at program start, and store all pointers in an array, for the 6p databases these are 3195 files !!

In a later stage I want to read a specific datablock (4KByte) on demand,
by moving the pointer with the fseek(), and then do a fread().

Im now confronted with a limitation of opened files in Windows.
Apparently this limit is 512 and can be set to 2048, but thats the upper maximum.

Im curious if other programmers also open "zillions" of files, and have developed bypasses to deal with this.

Note: A bypass, is to open the file every time i want to read "another datablock, but im afraid of the time penalty involved.

Bert
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Post by TAILLE »

BertTuyt wrote:I have a specific issue with my attempt to add a database cache in my program, in stead of reading all files at the start.

What I now do is to open ( fopen() )all sub-databases (so including all sliced sun-versions) at program start, and store all pointers in an array, for the 6p databases these are 3195 files !!

In a later stage I want to read a specific datablock (4KByte) on demand,
by moving the pointer with the fseek(), and then do a fread().

Im now confronted with a limitation of opened files in Windows.
Apparently this limit is 512 and can be set to 2048, but thats the upper maximum.

Im curious if other programmers also open "zillions" of files, and have developed bypasses to deal with this.

Note: A bypass, is to open the file every time i want to read "another datablock, but im afraid of the time penalty involved.

Bert
I have not this problem with Damy. For the 6p db I have only 2 files, one if it white to play and one if is black to play.
For the 7p db I built 76 files (2 files per slice). With the same approach I intend to add 78 files for the 8p db.
Gérard
Ed Gilbert
Posts: 860
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Post by Ed Gilbert »

Bert, when building databases I open a file when a lookup is needed and then keep the open file handle around for a maximum of a few days. If it is not used after a few days I close it. I don't know how many files I have open simultaneously but I'm sure that at times it is at least a few thousand. I do not use fopen. I use CreateFile, SetFilePointer, ReadFile, and CloseHandle.

-- Ed
Ed Gilbert
Posts: 860
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Post by Ed Gilbert »

Bert, the program I use to build dbs for 8x8 checkers leaves the files open for a longer period of time, and there I have had ~30k file handles open. Also, the "few thousand" figure I mentioned is for each instance of the builder. On the 8-core machine there were 8 instances running, each with a few thousand open file handles.

-- Ed
64_bit_checkers_engine
Posts: 62
Joined: Mon Apr 20, 2009 01:10

Post by 64_bit_checkers_engine »

BertTuyt wrote: Im now confronted with a limitation of opened files in Windows.
Apparently this limit is 512 and can be set to 2048, but thats the upper maximum.

Bert
This sounds familiar. I only use a maximum of 400 files for all 10-piece database slices in 8x8 checkers, but I encountered it elsewhere on another project I was working on at the time.

I think the "problem" is the "Windows disk cache" setting. There is something, somewhere, under one of those endless "tabs" on a "properties" window you get by right-clicking on your hard drive icon.

Find it. And then turn off windows' disk cache.
64_bit_checkers_engine
Posts: 62
Joined: Mon Apr 20, 2009 01:10

Post by 64_bit_checkers_engine »

BertTuyt wrote: What I now do is to open ( fopen() )
I also remember reading that "fopen()" is now one of those "old-time" function calls, and it has been made "obsolete" but still able to be accessed.

I forget what the name of the new routine is, I think fopen64(). Same goes for fclose() and fseek(), etc.
BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Post by BertTuyt »

Ed,

I was also looking at the CreateFile functions, but i wanted to be sure, that there was not another alternative.
In the mean time im re-writing my program , and replacing the traditional fopen/fread/fwrite/flcose.

I have still an open question:

* Is it possible to address large files with thes functions ( so > 2 Gigabyte). For now the files which I have are smaller, but if I want to merge all the sliced sub-databases (most advanced rank of white/black man), into a single file (don't even know if this is really needed), i might go beyond the 4 GigaByte for a single file.

Did you also encounter this problem/issues, and if so what was your approach?

Bert
Ed Gilbert
Posts: 860
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Post by Ed Gilbert »

Is it possible to address large files with thes functions ( so > 2 Gigabyte).
Yes, certainly. Many of my db files are much larger than 2gb. For example, the file for 2bm 3bk 1wm 2wk is 44 gb. SetFilePointer takes a 64-bit file offset.

-- Ed
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

new bitboard layout

Post by Rein Halbersma »

Code: Select all

{
           0,  1,  2,  3,  4,
         6,  7,  8,  9, 10, 
          13, 14, 15, 16, 17,
        19, 20, 21, 22, 23, 
          26, 27, 28, 29, 30,
        32, 33, 34, 35, 36, 
          39, 40, 41, 42, 43,
        45, 46, 47, 48, 49, 
          52, 53, 54, 55, 56,
        58, 59, 60, 61, 62
}  
I have been redesigning my engine-to-be. Part of my design goal is to have a generic move generator / board layout (using C++ templates) for all the various draughts variants played around the world.

The above bitboard layout enables orthogonal captures for Frisian draughts. For those who know the ghost squares trick it's fairly obvious. For those who don't: not displayed are 2 columns at each side of the board (i.e. squares 5, 11, 12, 18, 24, 25, 37, 38, 44, 50, 51, 57). They eliminate edge detection and even/odd row distinction during move generation/evaluation.

It's just so nice that you have 1 bit to spare with 64-bit integers! It might also be handy for international draughts as you can now more easily detect horizontal patterns without the need for explicit edge detection.
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: new bitboard layout

Post by TAILLE »

[quote="Rein Halbersma"]

Code: Select all

{
           0,  1,  2,  3,  4,
         6,  7,  8,  9, 10, 
          13, 14, 15, 16, 17,
        19, 20, 21, 22, 23, 
          26, 27, 28, 29, 30,
        32, 33, 34, 35, 36, 
          39, 40, 41, 42, 43,
        45, 46, 47, 48, 49, 
          52, 53, 54, 55, 56,
        58, 59, 60, 61, 62
}  
For your information my coding is the following, with 11 spare bits I use for other purpose.

Code: Select all

{
           0,  1,  2,  3,  4,
         5,  6,  7,  8, 9, 
          11, 12, 13, 14, 15,
        16, 17, 18, 19, 20, 
          22, 23, 24, 25, 26,
        27, 28, 29, 30, 31, 
          33, 34, 35, 36, 37,
        38, 39, 40, 41, 42, 
          44, 45, 46, 47, 48,
        49, 50, 51, 52, 53
}  
Gérard
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: new bitboard layout

Post by Rein Halbersma »

TAILLE wrote: For your information my coding is the following, with 11 spare bits I use for other purpose.

Code: Select all

{
           0,  1,  2,  3,  4,
         5,  6,  7,  8, 9, 
          11, 12, 13, 14, 15,
        16, 17, 18, 19, 20, 
          22, 23, 24, 25, 26,
        27, 28, 29, 30, 31, 
          33, 34, 35, 36, 37,
        38, 39, 40, 41, 42, 
          44, 45, 46, 47, 48,
        49, 50, 51, 52, 53
}  
Hi Gerard,

Yes of course, we (and Bert) discussed this layout on this forum before. I only could make my modified layout because I learned from the example that you gave above. [img]images/smilies/icon_smile.gif[/img]

For what purposes do you use the other 11 bits in your layout? I do not have any information in these squares.

Rein

PS For those who are interested in games with hexagonal boards, I think bitboards can also be used there!
Ed Gilbert
Posts: 860
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Post by Ed Gilbert »

Rein,

What is the advantage of your representation over the one shown by Gerard? It seems they both eliminate explicit edge detection for man moves and man captures.

-- Ed
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Post by Rein Halbersma »

Ed Gilbert wrote:Rein,

What is the advantage of your representation over the one shown by Gerard? It seems they both eliminate explicit edge detection for man moves and man captures.

-- Ed
There are 2 advantages I can think of:
1) you can detect all horizontal configurations in parallel without edge detection. E.g. suppose you want to give points in your eval() for every pair of adjacent pieces like W27,28. Then with the usual layout you have to be careful as not to count e.g. W35,36 as such a pair because these squares are in adjacent bits (37 and 38). With my new layout they are in bits 43 and 45 with a ghost square on bit 44 in between.
2) you can now also have orthogonal captures. This is not relevant for international 10x10 draughts. But for Frisian draughts, which is played on the same 10x10 board, orthogonal captures are allowed. In the position W:WK2,B3 white can capture 2x4 and in B:WK2,B3 black can capture 3x1. But when pieces are close to the edge, you have to make sure that captures don't wrap around to the opposite edge of the next row. My layout adds another column of bit squares on both sides of the board to take care of this.

Anyway, if you are not interested in Frisian draughts (but which is a very beautiful game and 2 kings against 1 win already, so very few draws!) you only have the advantage of having somewhat easier eval patterns.
BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Post by BertTuyt »

Rein, other then the needs for other 10x10 variations, I don't really see the advantage of adding additional ghost squares. I browsed through me evaluation function, but nowhere I could find an advantage for this addition.

When i started , long ago with Damage, i started counting from 46-50, 36-40, so the other way around, and implemented this scheme in my bitboard.

If i have time, i still want to change this, it will not have any impact on my program, speed or whatever, but i like "internal Beauty". In case I will redo work, I will most likely start with the same schedule as Gerard uses , so square 1 is bitposition 0.

For an unknown reason i like standardization, so we are able to share more and to exchange more.

If someone is interested in my movegenerator ( in c++), feel free to ask, the only condition for sharing is that I get also the same from the other side.

With kind regards,

Bert
Post Reply