handling db blocks

Discussion about development of draughts in the time of computer and Internet.
Post Reply
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

handling db blocks

Post by TAILLE » Thu Jul 23, 2009 11:21

Hi Ed,

I have a question concerning the handling of db blocks.
Let's suppose that the 8 pc db need about 400Gb of disk space. With about 4kb by block that means that we have to handle about 100M block of data.
For the 7pc db I needed 96bits per block to describe where is the block on the disk, where is the block in memory and which positions are found in that block. All these information are kept in memory.
For the 8pc db I intend to implement 128bits per block to describe the block. With 100M blocks that represents 1,6Gb.
It seems possible for me to continue to have all such information in memory.
Can you tell us what was your approach ?
Gérard

Ed Gilbert
Posts: 859
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Post by Ed Gilbert » Sat Jul 25, 2009 14:20

Gerard,

I use only 32 bits per block, so it is not so much overhead. You need to separate the information that is needed for every database block from that which is only needed for the blocks that are in your cache buffers, as your number of cache buffers is obviously a much smaller number.

-- Ed

64_bit_checkers_engine
Posts: 62
Joined: Mon Apr 20, 2009 01:10
Contact:

Re: handling db blocks

Post by 64_bit_checkers_engine » Sun Aug 02, 2009 17:25

TAILLE wrote:Hi Ed,


Let's suppose that the 8 pc db need about 400Gb of disk space. With about 4kb by block that means that we have to handle about 100M block of data.

For the 7pc db I needed 96bits per block to describe where is the block on the disk, where is the block in memory and which positions are found in that block. All these information are kept in memory.

For the 8pc db I intend to implement 128bits per block to describe the block. With 100M blocks that represents 1,6Gb.
The number of bits you need should not change as the number of pieces increase. Your database should not store the LOCATION of the pieces that make up a position. Your database should be just an array of INDEXED positions, one after another, from "entry 0" to "entry N", where N is the "total number of positions -1".

That way, you only need as many bits as the largest number to handle the "N/BLOCK SIZE".

If you have only 100 million blocks, you don't even need 32 bits per block, but this makes it easier if you ever need to go up to 4 billion blocks.

64_bit_checkers_engine
Posts: 62
Joined: Mon Apr 20, 2009 01:10
Contact:

Post by 64_bit_checkers_engine » Sun Aug 02, 2009 17:33

(see next post)
Last edited by 64_bit_checkers_engine on Sun Aug 02, 2009 17:38, edited 1 time in total.

64_bit_checkers_engine
Posts: 62
Joined: Mon Apr 20, 2009 01:10
Contact:

Post by 64_bit_checkers_engine » Sun Aug 02, 2009 17:37

I think your concern is, how to find a block in your most-recently-seen buffer. Here is how I do it for my uncompressed databases, 10-bits per position, with distance-to-win probed in RAM during the search:

Code: Select all

total blocks (g_buffer_node_count) defined for current db solving run = 0

DB 2 slice # 1 requires 1 BLOCKS.
running count of blocks BEFORE this slice = 0

DB 2 slice # 2 requires 1 BLOCKS.
running count of blocks BEFORE this slice = 1

DB 2 slice # 3 requires 1 BLOCKS.
running count of blocks BEFORE this slice = 2

DB 2 slice # 4 requires 1 BLOCKS.
running count of blocks BEFORE this slice = 3

DB 3 slice # 1 requires 5 BLOCKS.
running count of blocks BEFORE this slice = 4

DB 3 slice # 2 requires 5 BLOCKS.
running count of blocks BEFORE this slice = 9

DB 3 slice # 3 requires 8 BLOCKS.
running count of blocks BEFORE this slice = 14

DB 3 slice # 4 requires 8 BLOCKS.
running count of blocks BEFORE this slice = 22

DB 3 slice # 5 requires 4 BLOCKS.
running count of blocks BEFORE this slice = 30

DB 3 slice # 6 requires 4 BLOCKS.
running count of blocks BEFORE this slice = 34

DB 3 slice # 7 requires 4 BLOCKS.
running count of blocks BEFORE this slice = 38

DB 3 slice # 8 requires 4 BLOCKS.
running count of blocks BEFORE this slice = 42

DB 3 slice # 9 requires 7 BLOCKS.
running count of blocks BEFORE this slice = 46

DB 3 slice # 10 requires 7 BLOCKS.
running count of blocks BEFORE this slice = 53

DB 3 slice # 11 requires 4 BLOCKS.
running count of blocks BEFORE this slice = 60

DB 3 slice # 12 requires 4 BLOCKS.
running count of blocks BEFORE this slice = 64

DB 4 slice # 1 requires 66 BLOCKS.
running count of blocks BEFORE this slice = 68

DB 4 slice # 2 requires 116 BLOCKS.
running count of blocks BEFORE this slice = 134

DB 4 slice # 3 requires 51 BLOCKS.
running count of blocks BEFORE this slice = 250

DB 4 slice # 4 requires 116 BLOCKS.
running count of blocks BEFORE this slice = 301

DB 4 slice # 5 requires 202 BLOCKS.
running count of blocks BEFORE this slice = 417

DB 4 slice # 6 requires 88 BLOCKS.
running count of blocks BEFORE this slice = 619

DB 4 slice # 7 requires 51 BLOCKS.
running count of blocks BEFORE this slice = 707

DB 4 slice # 8 requires 88 BLOCKS.
running count of blocks BEFORE this slice = 758

DB 4 slice # 9 requires 39 BLOCKS.
running count of blocks BEFORE this slice = 846

DB 4 slice # 10 requires 44 BLOCKS.
running count of blocks BEFORE this slice = 885

DB 4 slice # 11 requires 44 BLOCKS.
running count of blocks BEFORE this slice = 929

DB 4 slice # 12 requires 116 BLOCKS.
running count of blocks BEFORE this slice = 973

DB 4 slice # 13 requires 116 BLOCKS.
running count of blocks BEFORE this slice = 1089

DB 4 slice # 14 requires 101 BLOCKS.
running count of blocks BEFORE this slice = 1205

DB 4 slice # 15 requires 101 BLOCKS.
running count of blocks BEFORE this slice = 1306

DB 4 slice # 16 requires 30 BLOCKS.
running count of blocks BEFORE this slice = 1407

DB 4 slice # 17 requires 30 BLOCKS.
running count of blocks BEFORE this slice = 1437

DB 4 slice # 18 requires 39 BLOCKS.
running count of blocks BEFORE this slice = 1467

DB 4 slice # 19 requires 39 BLOCKS.
running count of blocks BEFORE this slice = 1506

DB 4 slice # 20 requires 101 BLOCKS.
running count of blocks BEFORE this slice = 1545

DB 4 slice # 21 requires 101 BLOCKS.
running count of blocks BEFORE this slice = 1646

DB 4 slice # 22 requires 88 BLOCKS.
running count of blocks BEFORE this slice = 1747

DB 4 slice # 23 requires 88 BLOCKS.
running count of blocks BEFORE this slice = 1835

DB 4 slice # 24 requires 26 BLOCKS.
running count of blocks BEFORE this slice = 1923

DB 4 slice # 25 requires 26 BLOCKS.
running count of blocks BEFORE this slice = 1949

DB 5 slice # 1 requires 615 BLOCKS.
running count of blocks BEFORE this slice = 1975

DB 5 slice # 2 requires 615 BLOCKS.
running count of blocks BEFORE this slice = 2590

DB 5 slice # 3 requires 1076 BLOCKS.
running count of blocks BEFORE this slice = 3205

[LARGE SECTION CUT OUT FOR BREVITY]

DB 10 slice # 208 requires 675338 BLOCKS.
running count of blocks BEFORE this slice = 2967167033

DB 10 slice # 209 requires 4733795 BLOCKS.
running count of blocks BEFORE this slice = 2967842371

DB 10 slice # 210 requires 4733795 BLOCKS.
running count of blocks BEFORE this slice = 2972576166

DB 10 slice # 211 requires 14451361 BLOCKS.
running count of blocks BEFORE this slice = 2977309961

DB 10 slice # 212 requires 14451361 BLOCKS.
running count of blocks BEFORE this slice = 2991761322

DB 10 slice # 213 requires 25087809 BLOCKS.
running count of blocks BEFORE this slice = 3006212683

DB 10 slice # 214 requires 25087809 BLOCKS.
running count of blocks BEFORE this slice = 3031300492

DB 10 slice # 215 requires 27079078 BLOCKS.
running count of blocks BEFORE this slice = 3056388301

DB 10 slice # 216 requires 27079078 BLOCKS.
running count of blocks BEFORE this slice = 3083467379

DB 10 slice # 217 requires 18601524 BLOCKS.
running count of blocks BEFORE this slice = 3110546457

DB 10 slice # 218 requires 18601524 BLOCKS.
running count of blocks BEFORE this slice = 3129147981

DB 10 slice # 219 requires 7938023 BLOCKS.
running count of blocks BEFORE this slice = 3147749505

DB 10 slice # 220 requires 7938023 BLOCKS.
running count of blocks BEFORE this slice = 3155687528

DB 10 slice # 221 requires 1923041 BLOCKS.
running count of blocks BEFORE this slice = 3163625551

DB 10 slice # 222 requires 1923041 BLOCKS.
running count of blocks BEFORE this slice = 3165548592

DB 10 slice # 223 requires 202370 BLOCKS.
running count of blocks BEFORE this slice = 3167471633

DB 10 slice # 224 requires 202370 BLOCKS.
running count of blocks BEFORE this slice = 3167674003

DB 10 slice # 225 requires 196924 BLOCKS.
running count of blocks BEFORE this slice = 3167876373

DB 10 slice # 226 requires 196924 BLOCKS.
running count of blocks BEFORE this slice = 3168073297

DB 10 slice # 227 requires 1550776 BLOCKS.
running count of blocks BEFORE this slice = 3168270221

DB 10 slice # 228 requires 1550776 BLOCKS.
running count of blocks BEFORE this slice = 3169820997

DB 10 slice # 229 requires 5402701 BLOCKS.
running count of blocks BEFORE this slice = 3171371773

DB 10 slice # 230 requires 5402701 BLOCKS.
running count of blocks BEFORE this slice = 3176774474

DB 10 slice # 231 requires 10925461 BLOCKS.
running count of blocks BEFORE this slice = 3182177175

DB 10 slice # 232 requires 10925461 BLOCKS.
running count of blocks BEFORE this slice = 3193102636

DB 10 slice # 233 requires 14127751 BLOCKS.
running count of blocks BEFORE this slice = 3204028097

DB 10 slice # 234 requires 14127751 BLOCKS.
running count of blocks BEFORE this slice = 3218155848

DB 10 slice # 235 requires 12109501 BLOCKS.
running count of blocks BEFORE this slice = 3232283599

DB 10 slice # 236 requires 12109501 BLOCKS.
running count of blocks BEFORE this slice = 3244393100

DB 10 slice # 237 requires 6877001 BLOCKS.
running count of blocks BEFORE this slice = 3256502601

DB 10 slice # 238 requires 6877001 BLOCKS.
running count of blocks BEFORE this slice = 3263379602

DB 10 slice # 239 requires 2493858 BLOCKS.
running count of blocks BEFORE this slice = 3270256603

DB 10 slice # 240 requires 2493858 BLOCKS.
running count of blocks BEFORE this slice = 3272750461

DB 10 slice # 241 requires 523711 BLOCKS.
running count of blocks BEFORE this slice = 3275244319

DB 10 slice # 242 requires 523711 BLOCKS.
running count of blocks BEFORE this slice = 3275768030

DB 10 slice # 243 requires 48492 BLOCKS.
running count of blocks BEFORE this slice = 3276291741

DB 10 slice # 244 requires 48492 BLOCKS.
running count of blocks BEFORE this slice = 3276340233

DB 10 slice # 245 requires 172309 BLOCKS.
running count of blocks BEFORE this slice = 3276388725

DB 10 slice # 246 requires 172309 BLOCKS.
running count of blocks BEFORE this slice = 3276561034

DB 10 slice # 247 requires 1357822 BLOCKS.
running count of blocks BEFORE this slice = 3276733343

DB 10 slice # 248 requires 1357822 BLOCKS.
running count of blocks BEFORE this slice = 3278091165

DB 10 slice # 249 requires 4733795 BLOCKS.
running count of blocks BEFORE this slice = 3279448987

DB 10 slice # 250 requires 4733795 BLOCKS.
running count of blocks BEFORE this slice = 3284182782

DB 10 slice # 251 requires 9579961 BLOCKS.
running count of blocks BEFORE this slice = 3288916577

DB 10 slice # 252 requires 9579961 BLOCKS.
running count of blocks BEFORE this slice = 3298496538

DB 10 slice # 253 requires 12397822 BLOCKS.
running count of blocks BEFORE this slice = 3308076499

DB 10 slice # 254 requires 12397822 BLOCKS.
running count of blocks BEFORE this slice = 3320474321

DB 10 slice # 255 requires 10635858 BLOCKS.
running count of blocks BEFORE this slice = 3332872143

DB 10 slice # 256 requires 10635858 BLOCKS.
running count of blocks BEFORE this slice = 3343508001

DB 10 slice # 257 requires 6045715 BLOCKS.
running count of blocks BEFORE this slice = 3354143859

DB 10 slice # 258 requires 6045715 BLOCKS.
running count of blocks BEFORE this slice = 3360189574

DB 10 slice # 259 requires 2194595 BLOCKS.
running count of blocks BEFORE this slice = 3366235289

DB 10 slice # 260 requires 2194595 BLOCKS.
running count of blocks BEFORE this slice = 3368429884

DB 10 slice # 261 requires 461364 BLOCKS.
running count of blocks BEFORE this slice = 3370624479

DB 10 slice # 262 requires 461364 BLOCKS.
running count of blocks BEFORE this slice = 3371085843

DB 10 slice # 263 requires 42770 BLOCKS.
running count of blocks BEFORE this slice = 3371547207

DB 10 slice # 264 requires 42770 BLOCKS.
running count of blocks BEFORE this slice = 3371589977

Each "slice" is given a unique global number, from 1 to 400. I look up that number, and I have a "starting block" ID nuumber for that slice. I compute the index, divide by the block size, add it to the "starting block", and that must be my global block ID.

I then can determine if that block ID is already in the buffer or not by using another linked list I maintain.

64_bit_checkers_engine
Posts: 62
Joined: Mon Apr 20, 2009 01:10
Contact:

Buffer diagrams

Post by 64_bit_checkers_engine » Sun Aug 02, 2009 18:50

Another thing I did to help me understand how my buffer was performing, and how I might optimize it, was to create a utility for drawing the contents of my buffer. A snapshot from solving the 5-piece slice:

Code: Select all

<----------TOP--------->                                                                                                    
-----------------------------------------------------------------------------------------------------------------------------
|       [0000196]       ||       [0000195]       ||       [0000026]       ||       [0000194]       ||       [0000193]       |
|                       ||                       ||                       ||                       ||                       |
|XXXXXXX<-u   d->0000195||0000196<-u   d->0000026||0000195<-u   d->0000194||0000026<-u   d->0000193||0000194<-u   d->0000025|
|                       ||                       ||                       ||                       ||                       |
| DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 |
|                       ||                       ||                       ||                       ||                       |
|   BLOCK # 000000196   ||   BLOCK # 000000195   ||   BLOCK # 000000178   ||   BLOCK # 000000194   ||   BLOCK # 000000177   |
|                       ||                       ||                       ||                       ||                       |
|  W015 W015 W005 W019  ||  W025 ???? ???? ????  ||  ???? ???? ???? W005  ||  W015 W019 W023 W011  ||  ???? ???? ???? ????  |
|  W009 W015 W007 W015  ||  ???? ???? ???? ????  ||  W021 ???? ???? ????  ||  W019 W019 W009 W023  ||  ???? ???? ???? ????  |
|  W015 W009 W009 W001  ||  ???? ???? ???? ????  ||  ???? ???? ???? ????  ||  ???? ???? ???? ????  ||  ???? ???? ???? ????  |
|                       ||                       ||                       ||                       ||                       |
-----------------------------------------------------------------------------------------------------------------------------
                                                                                                                             

                                                                                                                             
-----------------------------------------------------------------------------------------------------------------------------
|       [0000025]       ||       [0000192]       ||       [0000191]       ||       [0000190]       ||       [0000024]       |
|                       ||                       ||                       ||                       ||                       |
|0000193<-u   d->0000192||0000025<-u   d->0000191||0000192<-u   d->0000190||0000191<-u   d->0000024||0000190<-u   d->0000189|
|                       ||                       ||                       ||                       ||                       |
| DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 |
|                       ||                       ||                       ||                       ||                       |
|   BLOCK # 000000161   ||   BLOCK # 000000193   ||   BLOCK # 000000176   ||   BLOCK # 000000160   ||   BLOCK # 000000145   |
|                       ||                       ||                       ||                       ||                       |
|  ???? ???? ???? ????  ||  W005 W019 W001 ????  ||  ???? ???? ???? ????  ||  ???? ???? ???? ????  ||  ???? ???? ???? ????  |
|  ???? ???? ???? ????  ||  ???? ???? ???? ????  ||  ???? ???? ???? ????  ||  ???? ???? ???? ????  ||  ???? ???? ???? ????  |
|  ???? ???? ???? ????  ||  ???? ???? ???? ????  ||  ???? ???? W003 W003  ||  ???? ???? ???? ????  ||  ???? ???? ???? ????  |
|                       ||                       ||                       ||                       ||                       |
-----------------------------------------------------------------------------------------------------------------------------
                                                                                                                             

                                                                                                                             
-----------------------------------------------------------------------------------------------------------------------------
|       [0000189]       ||       [0000188]       ||       [0000187]       ||       [0000183]       ||       [0000186]       |
|                       ||                       ||                       ||                       ||                       |
|0000024<-u   d->0000188||0000189<-u   d->0000187||0000188<-u   d->0000183||0000187<-u   d->0000186||0000183<-u   d->0000182|
|                       ||                       ||                       ||                       ||                       |
| DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 |
|                       ||                       ||                       ||                       ||                       |
|   BLOCK # 000000192   ||   BLOCK # 000000175   ||   BLOCK # 000000159   ||   BLOCK # 000000158   ||   BLOCK # 000000144   |
|                       ||                       ||                       ||                       ||                       |
|  ???? ???? ???? ????  ||  ???? ???? ???? ????  ||  ???? ???? ???? ????  ||  W011 W011 W011 W011  ||  ???? ???? ???? ????  |
|  ???? ???? W013 ????  ||  ???? ???? ???? ????  ||  ???? ???? ???? ????  ||  W013 W011 W011 W007  ||  ???? ???? ???? ????  |
|  ???? ???? ???? ????  ||  ???? W007 W007 ????  ||  ???? ???? ???? W007  ||  W011 W011 W011 W001  ||  ???? ???? ???? W033  |
|                       ||                       ||                       ||                       ||                       |
-----------------------------------------------------------------------------------------------------------------------------
                                                                                                                             


-----------------------------------------------------------------------------------------------------------------------------
|       [0000004]       ||       [0000029]       ||       [0000028]       ||       [0000027]       ||       [0000003]       |
|                       ||                       ||                       ||                       ||                       |
|0000030<-u   d->0000029||0000004<-u   d->0000028||0000029<-u   d->0000027||0000028<-u   d->0000003||0000027<-u   d->0000012|
|                       ||                       ||                       ||                       ||                       |
| DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 |
|                       ||                       ||                       ||                       ||                       |
|   BLOCK # 000000004   ||   BLOCK # 000000053   ||   BLOCK # 000000009   ||   BLOCK # 000000005   ||   BLOCK # 000000003   |
|                       ||                       ||                       ||                       ||                       |
|  W011 W015 W017 W011  ||  W053 W053 W029 W049  ||  W049 W021 W055 W055  ||  W057 W057 W019 W019  ||  W047 W047 W047 W045  |
|  W013 W013 W007 W011  ||  W051 W041 W053 W019  ||  W021 W055 W055 W053  ||  W051 W051 W031 W057  ||  W029 W035 W029 W023  |
|  W015 W039 W013 W039  ||  W017 W017 W051 W051  ||  W057 W057 W057 W021  ||  W057 W019 W015 W051  ||  W037 W009 W017 W029  |
|                       ||                       ||                       ||                       ||                       |
-----------------------------------------------------------------------------------------------------------------------------
                                                                                                                             

                                                                                                                             
-----------------------------------------------------------------------------------------------------------------------------
|       [0000012]       ||       [0000002]       ||       [0000001]       ||       [0000000]       ||       [NO DATA]       |
|                       ||                       ||                       ||                       ||                       |
|0000003<-u   d->0000002||0000012<-u   d->0000001||0000002<-u   d->0000000||0000001<-u   d->XXXXXXX||-------<-u   d->-------|
|                       ||                       ||                       ||                       ||                       |
| DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 || DB 05       SLICE 001 ||                       |
|                       ||                       ||                       ||                       ||                       |
|   BLOCK # 000000027   ||   BLOCK # 000000002   ||   BLOCK # 000000001   ||   BLOCK # 000000000   ||                       |
|                       ||                       ||                       ||                       ||                       |
|  W035 W017 W013 W011  ||  W047 W027 DRAW W023  ||  W003 W003 W043 W027  ||  W005 W003 W007 W003  ||                       |
|  W035 W029 W045 W045  ||  W045 W015 W035 W007  ||  W049 W061 W013 W049  ||  W007 W011 W003 W011  ||                       |
|  W025 W007 W031 W031  ||  W007 W019 W049 W051  ||  W001 W017 W029 W049  ||  W007 W001 W015 W019  ||                       |
|                       ||                       ||                       ||                       ||                       |
-----------------------------------------------------------------------------------------------------------------------------
                                                                           <--------BOTTOM-------->                         

"XXXXXXX<-u d->0000195" means there is nothing being pointed to "up", and block number 195 is being pointed to "down"

"W015 W015 W005 W019" is a look into the first few values written into that block. Win in 15 plies, Win in 15 pies, Win in 5 plies, Win in 19 plies, etc.

Likewise "0000196<-u d->0000026" means it points "up" to block number 196, and down to block number 26.

Using this layout, I was able to change my indexing functions to make them "more localized", meaning the block would get reused as much as possible before being paged to disk.

My buffer efficiency for very large database operations, computed by "block reads in RAM divided by RAM plus disk reads", was 99.999998% and higher. This is almost as good as being able to compute everything in RAM.

Ed Gilbert
Posts: 859
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Post by Ed Gilbert » Mon Aug 03, 2009 01:52

I use only 32 bits per block, so it is not so much overhead.
Gerard, I was wrong when I wrote this. I forgot that there is also another 32-bit integer for each 4K db block that indicates which cache block it is loaded into, or a special value if not in cache. So for every 4K region of the whole db there are two 32-bit numbers in memory, 8 bytes total. Sorry for my poor memory [img]images/smilies/icon_sad.gif[/img]

-- Ed

TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Post by TAILLE » Mon Aug 03, 2009 10:30

Hi Ed.
Ed Gilbert wrote:
I use only 32 bits per block, so it is not so much overhead.
Gerard, I was wrong when I wrote this. I forgot that there is also another 32-bit integer for each 4K db block that indicates which cache block it is loaded into, or a special value if not in cache. So for every 4K region of the whole db there are two 32-bit numbers in memory, 8 bytes total. Sorry for my poor memory [img]images/smilies/icon_sad.gif[/img]

-- Ed
That looks more understandable. So, for each block you have the starting index and the adress in memory (or block number). Thank you Ed. In the new approach I have in mind (blocks with a fixe number of positions, i.e. around 200K positions) I will also use 64 bits per block in order to store the adress on the disk and the adress in memory.
The difference is very clear : you have the starting index where I have the adress on the disk.
Gérard

Post Reply