EndGame Databases

64_bit_checkers_engine · Sat Aug 01, 2009 22:47

Rein Halbersma wrote:
TAILLE wrote:
Rein Halbersma wrote:For the larger databases (>6 pieces), they have for every database subdivision (wm, wk, bm, wk, wr, br tuple) a plain text index file with for every 4K block of data in the database the 32-bit index of the position that starts the block. Suppose you build the full 8 pc databases and ultimately have a 512 Gb database on disk. That means 128M blocks of indices, 4 bytes per starting index so about 512 Mb of indexing data to be in RAM at all times.
How do you manage to have only 4 bytes for the starting index in the case of the 8 pieces db ? Each slices of this db have far more than 4G positions.
This is how Ed describes it:
http://pages.prodigy.net/eyg/Checkers/10-pieceBuild.htm

"subdivide each slice into fixed size subdivisions of size 2^32. To find which subdivision of a slice a position is in, the 64-bit slice index is computed for the position, and this number is divided by 2^32. The integer quotient identifies which subdivision contains the index, and the 32-bit remainder from the division becomes the index of the position within that subdivision."

So you use 64-bit indices during building, but 32-bit indices in the index file.

Actually, that is how I described it to Ed Glibert, and my name is Ed too, so your comment is correct

[img]images/smilies/icon_smile.gif[/img]

However, Ed Gilbert eventually used 2^31 instead of 2^32 for reasons he mentioned later.

BertTuyt · Post by **BertTuyt** » Tue Aug 04, 2009 22:04

I have a specific issue with my attempt to add a database cache in my program, in stead of reading all files at the start.

What I now do is to open ( fopen() )all sub-databases (so including all sliced sun-versions) at program start, and store all pointers in an array, for the 6p databases these are 3195 files !!

In a later stage I want to read a specific datablock (4KByte) on demand,
by moving the pointer with the fseek(), and then do a fread().

Im now confronted with a limitation of opened files in Windows.
Apparently this limit is 512 and can be set to 2048, but thats the upper maximum.

Im curious if other programmers also open "zillions" of files, and have developed bypasses to deal with this.

Note: A bypass, is to open the file every time i want to read "another datablock, but im afraid of the time penalty involved.

Bert

TAILLE · Post by **TAILLE** » Wed Aug 05, 2009 00:28

BertTuyt wrote:I have a specific issue with my attempt to add a database cache in my program, in stead of reading all files at the start.

What I now do is to open ( fopen() )all sub-databases (so including all sliced sun-versions) at program start, and store all pointers in an array, for the 6p databases these are 3195 files !!

In a later stage I want to read a specific datablock (4KByte) on demand,
by moving the pointer with the fseek(), and then do a fread().

Im now confronted with a limitation of opened files in Windows.
Apparently this limit is 512 and can be set to 2048, but thats the upper maximum.

Im curious if other programmers also open "zillions" of files, and have developed bypasses to deal with this.

Note: A bypass, is to open the file every time i want to read "another datablock, but im afraid of the time penalty involved.

Bert

I have not this problem with Damy. For the 6p db I have only 2 files, one if it white to play and one if is black to play.
For the 7p db I built 76 files (2 files per slice). With the same approach I intend to add 78 files for the 8p db.

Ed Gilbert · Post by **Ed Gilbert** » Wed Aug 05, 2009 02:34

Bert, when building databases I open a file when a lookup is needed and then keep the open file handle around for a maximum of a few days. If it is not used after a few days I close it. I don't know how many files I have open simultaneously but I'm sure that at times it is at least a few thousand. I do not use fopen. I use CreateFile, SetFilePointer, ReadFile, and CloseHandle.

-- Ed

Ed Gilbert · Post by **Ed Gilbert** » Wed Aug 05, 2009 13:07

Bert, the program I use to build dbs for 8x8 checkers leaves the files open for a longer period of time, and there I have had ~30k file handles open. Also, the "few thousand" figure I mentioned is for each instance of the builder. On the 8-core machine there were 8 instances running, each with a few thousand open file handles.

-- Ed

64_bit_checkers_engine · Thu Aug 06, 2009 01:19

BertTuyt wrote: Im now confronted with a limitation of opened files in Windows.
Apparently this limit is 512 and can be set to 2048, but thats the upper maximum.

Bert

This sounds familiar. I only use a maximum of 400 files for all 10-piece database slices in 8x8 checkers, but I encountered it elsewhere on another project I was working on at the time.

I think the "problem" is the "Windows disk cache" setting. There is something, somewhere, under one of those endless "tabs" on a "properties" window you get by right-clicking on your hard drive icon.

Find it. And then turn off windows' disk cache.

64_bit_checkers_engine · Thu Aug 06, 2009 01:22

BertTuyt wrote: What I now do is to open ( fopen() )

I also remember reading that "fopen()" is now one of those "old-time" function calls, and it has been made "obsolete" but still able to be accessed.

I forget what the name of the new routine is, I think fopen64(). Same goes for fclose() and fseek(), etc.

BertTuyt · Post by **BertTuyt** » Sat Aug 08, 2009 16:00

Ed,

I was also looking at the CreateFile functions, but i wanted to be sure, that there was not another alternative.
In the mean time im re-writing my program , and replacing the traditional fopen/fread/fwrite/flcose.

I have still an open question:

* Is it possible to address large files with thes functions ( so > 2 Gigabyte). For now the files which I have are smaller, but if I want to merge all the sliced sub-databases (most advanced rank of white/black man), into a single file (don't even know if this is really needed), i might go beyond the 4 GigaByte for a single file.

Did you also encounter this problem/issues, and if so what was your approach?

Bert

Ed Gilbert · Post by **Ed Gilbert** » Sat Aug 08, 2009 18:30

Is it possible to address large files with thes functions ( so > 2 Gigabyte).

Yes, certainly. Many of my db files are much larger than 2gb. For example, the file for 2bm 3bk 1wm 2wk is 44 gb. SetFilePointer takes a 64-bit file offset.

-- Ed

Rein Halbersma · Post by **Rein Halbersma** » Fri Oct 09, 2009 21:27

Code: Select all

{
           0,  1,  2,  3,  4,
         6,  7,  8,  9, 10, 
          13, 14, 15, 16, 17,
        19, 20, 21, 22, 23, 
          26, 27, 28, 29, 30,
        32, 33, 34, 35, 36, 
          39, 40, 41, 42, 43,
        45, 46, 47, 48, 49, 
          52, 53, 54, 55, 56,
        58, 59, 60, 61, 62
}

I have been redesigning my engine-to-be. Part of my design goal is to have a generic move generator / board layout (using C++ templates) for all the various draughts variants played around the world.

The above bitboard layout enables orthogonal captures for Frisian draughts. For those who know the ghost squares trick it's fairly obvious. For those who don't: not displayed are 2 columns at each side of the board (i.e. squares 5, 11, 12, 18, 24, 25, 37, 38, 44, 50, 51, 57). They eliminate edge detection and even/odd row distinction during move generation/evaluation.

It's just so nice that you have 1 bit to spare with 64-bit integers! It might also be handy for international draughts as you can now more easily detect horizontal patterns without the need for explicit edge detection.

TAILLE · Post by **TAILLE** » Sat Oct 10, 2009 00:03

[quote="Rein Halbersma"]

Code: Select all

{
           0,  1,  2,  3,  4,
         6,  7,  8,  9, 10, 
          13, 14, 15, 16, 17,
        19, 20, 21, 22, 23, 
          26, 27, 28, 29, 30,
        32, 33, 34, 35, 36, 
          39, 40, 41, 42, 43,
        45, 46, 47, 48, 49, 
          52, 53, 54, 55, 56,
        58, 59, 60, 61, 62
}

For your information my coding is the following, with 11 spare bits I use for other purpose.

Code: Select all

{
           0,  1,  2,  3,  4,
         5,  6,  7,  8, 9, 
          11, 12, 13, 14, 15,
        16, 17, 18, 19, 20, 
          22, 23, 24, 25, 26,
        27, 28, 29, 30, 31, 
          33, 34, 35, 36, 37,
        38, 39, 40, 41, 42, 
          44, 45, 46, 47, 48,
        49, 50, 51, 52, 53
}

Rein Halbersma · Post by **Rein Halbersma** » Sat Oct 10, 2009 01:21

TAILLE wrote: For your information my coding is the following, with 11 spare bits I use for other purpose.

Code: Select all

{
           0,  1,  2,  3,  4,
         5,  6,  7,  8, 9, 
          11, 12, 13, 14, 15,
        16, 17, 18, 19, 20, 
          22, 23, 24, 25, 26,
        27, 28, 29, 30, 31, 
          33, 34, 35, 36, 37,
        38, 39, 40, 41, 42, 
          44, 45, 46, 47, 48,
        49, 50, 51, 52, 53
}

Hi Gerard,

Yes of course, we (and Bert) discussed this layout on this forum before. I only could make my modified layout because I learned from the example that you gave above. [img]images/smilies/icon_smile.gif[/img]

For what purposes do you use the other 11 bits in your layout? I do not have any information in these squares.

Rein

PS For those who are interested in games with hexagonal boards, I think bitboards can also be used there!

Ed Gilbert · Post by **Ed Gilbert** » Sat Oct 10, 2009 03:19

Rein,

What is the advantage of your representation over the one shown by Gerard? It seems they both eliminate explicit edge detection for man moves and man captures.

-- Ed

Rein Halbersma · Post by **Rein Halbersma** » Sat Oct 10, 2009 12:18

Ed Gilbert wrote:Rein,

What is the advantage of your representation over the one shown by Gerard? It seems they both eliminate explicit edge detection for man moves and man captures.

-- Ed

There are 2 advantages I can think of:
1) you can detect all horizontal configurations in parallel without edge detection. E.g. suppose you want to give points in your eval() for every pair of adjacent pieces like W27,28. Then with the usual layout you have to be careful as not to count e.g. W35,36 as such a pair because these squares are in adjacent bits (37 and 38). With my new layout they are in bits 43 and 45 with a ghost square on bit 44 in between.
2) you can now also have orthogonal captures. This is not relevant for international 10x10 draughts. But for Frisian draughts, which is played on the same 10x10 board, orthogonal captures are allowed. In the position W:WK2,B3 white can capture 2x4 and in B:WK2,B3 black can capture 3x1. But when pieces are close to the edge, you have to make sure that captures don't wrap around to the opposite edge of the next row. My layout adds another column of bit squares on both sides of the board to take care of this.

Anyway, if you are not interested in Frisian draughts (but which is a very beautiful game and 2 kings against 1 win already, so very few draws!) you only have the advantage of having somewhat easier eval patterns.

BertTuyt · Post by **BertTuyt** » Sat Oct 10, 2009 14:09

Rein, other then the needs for other 10x10 variations, I don't really see the advantage of adding additional ghost squares. I browsed through me evaluation function, but nowhere I could find an advantage for this addition.

When i started , long ago with Damage, i started counting from 46-50, 36-40, so the other way around, and implemented this scheme in my bitboard.

If i have time, i still want to change this, it will not have any impact on my program, speed or whatever, but i like "internal Beauty". In case I will redo work, I will most likely start with the same schedule as Gerard uses , so square 1 is bitposition 0.

For an unknown reason i like standardization, so we are able to share more and to exchange more.

If someone is interested in my movegenerator ( in c++), feel free to ask, the only condition for sharing is that I get also the same from the other side.

With kind regards,

Bert

World Draughts Forum

EndGame Databases

new bitboard layout

Re: new bitboard layout

Re: new bitboard layout