![](https://damforum.nl/bb3/images/ua.png)
handling db blocks
handling db blocks
Hi Ed,
I have a question concerning the handling of db blocks.
Let's suppose that the 8 pc db need about 400Gb of disk space. With about 4kb by block that means that we have to handle about 100M block of data.
For the 7pc db I needed 96bits per block to describe where is the block on the disk, where is the block in memory and which positions are found in that block. All these information are kept in memory.
For the 8pc db I intend to implement 128bits per block to describe the block. With 100M blocks that represents 1,6Gb.
It seems possible for me to continue to have all such information in memory.
Can you tell us what was your approach ?
I have a question concerning the handling of db blocks.
Let's suppose that the 8 pc db need about 400Gb of disk space. With about 4kb by block that means that we have to handle about 100M block of data.
For the 7pc db I needed 96bits per block to describe where is the block on the disk, where is the block in memory and which positions are found in that block. All these information are kept in memory.
For the 8pc db I intend to implement 128bits per block to describe the block. With 100M blocks that represents 1,6Gb.
It seems possible for me to continue to have all such information in memory.
Can you tell us what was your approach ?
Gérard
-
- Posts: 859
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
-
- Posts: 62
- Joined: Mon Apr 20, 2009 01:10
- Contact:
Re: handling db blocks
The number of bits you need should not change as the number of pieces increase. Your database should not store the LOCATION of the pieces that make up a position. Your database should be just an array of INDEXED positions, one after another, from "entry 0" to "entry N", where N is the "total number of positions -1".TAILLE wrote:Hi Ed,
Let's suppose that the 8 pc db need about 400Gb of disk space. With about 4kb by block that means that we have to handle about 100M block of data.
For the 7pc db I needed 96bits per block to describe where is the block on the disk, where is the block in memory and which positions are found in that block. All these information are kept in memory.
For the 8pc db I intend to implement 128bits per block to describe the block. With 100M blocks that represents 1,6Gb.
That way, you only need as many bits as the largest number to handle the "N/BLOCK SIZE".
If you have only 100 million blocks, you don't even need 32 bits per block, but this makes it easier if you ever need to go up to 4 billion blocks.
-
- Posts: 62
- Joined: Mon Apr 20, 2009 01:10
- Contact:
(see next post)
Last edited by 64_bit_checkers_engine on Sun Aug 02, 2009 17:38, edited 1 time in total.
-
- Posts: 62
- Joined: Mon Apr 20, 2009 01:10
- Contact:
I think your concern is, how to find a block in your most-recently-seen buffer. Here is how I do it for my uncompressed databases, 10-bits per position, with distance-to-win probed in RAM during the search:
Each "slice" is given a unique global number, from 1 to 400. I look up that number, and I have a "starting block" ID nuumber for that slice. I compute the index, divide by the block size, add it to the "starting block", and that must be my global block ID.
I then can determine if that block ID is already in the buffer or not by using another linked list I maintain.
Code: Select all
total blocks (g_buffer_node_count) defined for current db solving run = 0
DB 2 slice # 1 requires 1 BLOCKS.
running count of blocks BEFORE this slice = 0
DB 2 slice # 2 requires 1 BLOCKS.
running count of blocks BEFORE this slice = 1
DB 2 slice # 3 requires 1 BLOCKS.
running count of blocks BEFORE this slice = 2
DB 2 slice # 4 requires 1 BLOCKS.
running count of blocks BEFORE this slice = 3
DB 3 slice # 1 requires 5 BLOCKS.
running count of blocks BEFORE this slice = 4
DB 3 slice # 2 requires 5 BLOCKS.
running count of blocks BEFORE this slice = 9
DB 3 slice # 3 requires 8 BLOCKS.
running count of blocks BEFORE this slice = 14
DB 3 slice # 4 requires 8 BLOCKS.
running count of blocks BEFORE this slice = 22
DB 3 slice # 5 requires 4 BLOCKS.
running count of blocks BEFORE this slice = 30
DB 3 slice # 6 requires 4 BLOCKS.
running count of blocks BEFORE this slice = 34
DB 3 slice # 7 requires 4 BLOCKS.
running count of blocks BEFORE this slice = 38
DB 3 slice # 8 requires 4 BLOCKS.
running count of blocks BEFORE this slice = 42
DB 3 slice # 9 requires 7 BLOCKS.
running count of blocks BEFORE this slice = 46
DB 3 slice # 10 requires 7 BLOCKS.
running count of blocks BEFORE this slice = 53
DB 3 slice # 11 requires 4 BLOCKS.
running count of blocks BEFORE this slice = 60
DB 3 slice # 12 requires 4 BLOCKS.
running count of blocks BEFORE this slice = 64
DB 4 slice # 1 requires 66 BLOCKS.
running count of blocks BEFORE this slice = 68
DB 4 slice # 2 requires 116 BLOCKS.
running count of blocks BEFORE this slice = 134
DB 4 slice # 3 requires 51 BLOCKS.
running count of blocks BEFORE this slice = 250
DB 4 slice # 4 requires 116 BLOCKS.
running count of blocks BEFORE this slice = 301
DB 4 slice # 5 requires 202 BLOCKS.
running count of blocks BEFORE this slice = 417
DB 4 slice # 6 requires 88 BLOCKS.
running count of blocks BEFORE this slice = 619
DB 4 slice # 7 requires 51 BLOCKS.
running count of blocks BEFORE this slice = 707
DB 4 slice # 8 requires 88 BLOCKS.
running count of blocks BEFORE this slice = 758
DB 4 slice # 9 requires 39 BLOCKS.
running count of blocks BEFORE this slice = 846
DB 4 slice # 10 requires 44 BLOCKS.
running count of blocks BEFORE this slice = 885
DB 4 slice # 11 requires 44 BLOCKS.
running count of blocks BEFORE this slice = 929
DB 4 slice # 12 requires 116 BLOCKS.
running count of blocks BEFORE this slice = 973
DB 4 slice # 13 requires 116 BLOCKS.
running count of blocks BEFORE this slice = 1089
DB 4 slice # 14 requires 101 BLOCKS.
running count of blocks BEFORE this slice = 1205
DB 4 slice # 15 requires 101 BLOCKS.
running count of blocks BEFORE this slice = 1306
DB 4 slice # 16 requires 30 BLOCKS.
running count of blocks BEFORE this slice = 1407
DB 4 slice # 17 requires 30 BLOCKS.
running count of blocks BEFORE this slice = 1437
DB 4 slice # 18 requires 39 BLOCKS.
running count of blocks BEFORE this slice = 1467
DB 4 slice # 19 requires 39 BLOCKS.
running count of blocks BEFORE this slice = 1506
DB 4 slice # 20 requires 101 BLOCKS.
running count of blocks BEFORE this slice = 1545
DB 4 slice # 21 requires 101 BLOCKS.
running count of blocks BEFORE this slice = 1646
DB 4 slice # 22 requires 88 BLOCKS.
running count of blocks BEFORE this slice = 1747
DB 4 slice # 23 requires 88 BLOCKS.
running count of blocks BEFORE this slice = 1835
DB 4 slice # 24 requires 26 BLOCKS.
running count of blocks BEFORE this slice = 1923
DB 4 slice # 25 requires 26 BLOCKS.
running count of blocks BEFORE this slice = 1949
DB 5 slice # 1 requires 615 BLOCKS.
running count of blocks BEFORE this slice = 1975
DB 5 slice # 2 requires 615 BLOCKS.
running count of blocks BEFORE this slice = 2590
DB 5 slice # 3 requires 1076 BLOCKS.
running count of blocks BEFORE this slice = 3205
[LARGE SECTION CUT OUT FOR BREVITY]
DB 10 slice # 208 requires 675338 BLOCKS.
running count of blocks BEFORE this slice = 2967167033
DB 10 slice # 209 requires 4733795 BLOCKS.
running count of blocks BEFORE this slice = 2967842371
DB 10 slice # 210 requires 4733795 BLOCKS.
running count of blocks BEFORE this slice = 2972576166
DB 10 slice # 211 requires 14451361 BLOCKS.
running count of blocks BEFORE this slice = 2977309961
DB 10 slice # 212 requires 14451361 BLOCKS.
running count of blocks BEFORE this slice = 2991761322
DB 10 slice # 213 requires 25087809 BLOCKS.
running count of blocks BEFORE this slice = 3006212683
DB 10 slice # 214 requires 25087809 BLOCKS.
running count of blocks BEFORE this slice = 3031300492
DB 10 slice # 215 requires 27079078 BLOCKS.
running count of blocks BEFORE this slice = 3056388301
DB 10 slice # 216 requires 27079078 BLOCKS.
running count of blocks BEFORE this slice = 3083467379
DB 10 slice # 217 requires 18601524 BLOCKS.
running count of blocks BEFORE this slice = 3110546457
DB 10 slice # 218 requires 18601524 BLOCKS.
running count of blocks BEFORE this slice = 3129147981
DB 10 slice # 219 requires 7938023 BLOCKS.
running count of blocks BEFORE this slice = 3147749505
DB 10 slice # 220 requires 7938023 BLOCKS.
running count of blocks BEFORE this slice = 3155687528
DB 10 slice # 221 requires 1923041 BLOCKS.
running count of blocks BEFORE this slice = 3163625551
DB 10 slice # 222 requires 1923041 BLOCKS.
running count of blocks BEFORE this slice = 3165548592
DB 10 slice # 223 requires 202370 BLOCKS.
running count of blocks BEFORE this slice = 3167471633
DB 10 slice # 224 requires 202370 BLOCKS.
running count of blocks BEFORE this slice = 3167674003
DB 10 slice # 225 requires 196924 BLOCKS.
running count of blocks BEFORE this slice = 3167876373
DB 10 slice # 226 requires 196924 BLOCKS.
running count of blocks BEFORE this slice = 3168073297
DB 10 slice # 227 requires 1550776 BLOCKS.
running count of blocks BEFORE this slice = 3168270221
DB 10 slice # 228 requires 1550776 BLOCKS.
running count of blocks BEFORE this slice = 3169820997
DB 10 slice # 229 requires 5402701 BLOCKS.
running count of blocks BEFORE this slice = 3171371773
DB 10 slice # 230 requires 5402701 BLOCKS.
running count of blocks BEFORE this slice = 3176774474
DB 10 slice # 231 requires 10925461 BLOCKS.
running count of blocks BEFORE this slice = 3182177175
DB 10 slice # 232 requires 10925461 BLOCKS.
running count of blocks BEFORE this slice = 3193102636
DB 10 slice # 233 requires 14127751 BLOCKS.
running count of blocks BEFORE this slice = 3204028097
DB 10 slice # 234 requires 14127751 BLOCKS.
running count of blocks BEFORE this slice = 3218155848
DB 10 slice # 235 requires 12109501 BLOCKS.
running count of blocks BEFORE this slice = 3232283599
DB 10 slice # 236 requires 12109501 BLOCKS.
running count of blocks BEFORE this slice = 3244393100
DB 10 slice # 237 requires 6877001 BLOCKS.
running count of blocks BEFORE this slice = 3256502601
DB 10 slice # 238 requires 6877001 BLOCKS.
running count of blocks BEFORE this slice = 3263379602
DB 10 slice # 239 requires 2493858 BLOCKS.
running count of blocks BEFORE this slice = 3270256603
DB 10 slice # 240 requires 2493858 BLOCKS.
running count of blocks BEFORE this slice = 3272750461
DB 10 slice # 241 requires 523711 BLOCKS.
running count of blocks BEFORE this slice = 3275244319
DB 10 slice # 242 requires 523711 BLOCKS.
running count of blocks BEFORE this slice = 3275768030
DB 10 slice # 243 requires 48492 BLOCKS.
running count of blocks BEFORE this slice = 3276291741
DB 10 slice # 244 requires 48492 BLOCKS.
running count of blocks BEFORE this slice = 3276340233
DB 10 slice # 245 requires 172309 BLOCKS.
running count of blocks BEFORE this slice = 3276388725
DB 10 slice # 246 requires 172309 BLOCKS.
running count of blocks BEFORE this slice = 3276561034
DB 10 slice # 247 requires 1357822 BLOCKS.
running count of blocks BEFORE this slice = 3276733343
DB 10 slice # 248 requires 1357822 BLOCKS.
running count of blocks BEFORE this slice = 3278091165
DB 10 slice # 249 requires 4733795 BLOCKS.
running count of blocks BEFORE this slice = 3279448987
DB 10 slice # 250 requires 4733795 BLOCKS.
running count of blocks BEFORE this slice = 3284182782
DB 10 slice # 251 requires 9579961 BLOCKS.
running count of blocks BEFORE this slice = 3288916577
DB 10 slice # 252 requires 9579961 BLOCKS.
running count of blocks BEFORE this slice = 3298496538
DB 10 slice # 253 requires 12397822 BLOCKS.
running count of blocks BEFORE this slice = 3308076499
DB 10 slice # 254 requires 12397822 BLOCKS.
running count of blocks BEFORE this slice = 3320474321
DB 10 slice # 255 requires 10635858 BLOCKS.
running count of blocks BEFORE this slice = 3332872143
DB 10 slice # 256 requires 10635858 BLOCKS.
running count of blocks BEFORE this slice = 3343508001
DB 10 slice # 257 requires 6045715 BLOCKS.
running count of blocks BEFORE this slice = 3354143859
DB 10 slice # 258 requires 6045715 BLOCKS.
running count of blocks BEFORE this slice = 3360189574
DB 10 slice # 259 requires 2194595 BLOCKS.
running count of blocks BEFORE this slice = 3366235289
DB 10 slice # 260 requires 2194595 BLOCKS.
running count of blocks BEFORE this slice = 3368429884
DB 10 slice # 261 requires 461364 BLOCKS.
running count of blocks BEFORE this slice = 3370624479
DB 10 slice # 262 requires 461364 BLOCKS.
running count of blocks BEFORE this slice = 3371085843
DB 10 slice # 263 requires 42770 BLOCKS.
running count of blocks BEFORE this slice = 3371547207
DB 10 slice # 264 requires 42770 BLOCKS.
running count of blocks BEFORE this slice = 3371589977
I then can determine if that block ID is already in the buffer or not by using another linked list I maintain.
-
- Posts: 62
- Joined: Mon Apr 20, 2009 01:10
- Contact:
Buffer diagrams
Another thing I did to help me understand how my buffer was performing, and how I might optimize it, was to create a utility for drawing the contents of my buffer. A snapshot from solving the 5-piece slice:
"XXXXXXX<-u d->0000195" means there is nothing being pointed to "up", and block number 195 is being pointed to "down"
"W015 W015 W005 W019" is a look into the first few values written into that block. Win in 15 plies, Win in 15 pies, Win in 5 plies, Win in 19 plies, etc.
Likewise "0000196<-u d->0000026" means it points "up" to block number 196, and down to block number 26.
Using this layout, I was able to change my indexing functions to make them "more localized", meaning the block would get reused as much as possible before being paged to disk.
My buffer efficiency for very large database operations, computed by "block reads in RAM divided by RAM plus disk reads", was 99.999998% and higher. This is almost as good as being able to compute everything in RAM.
Code: Select all
<----------TOP--------->
-----------------------------------------------------------------------------------------------------------------------------
| [0000196] || [0000195] || [0000026] || [0000194] || [0000193] |
| || || || || |
|XXXXXXX<-u d->0000195||0000196<-u d->0000026||0000195<-u d->0000194||0000026<-u d->0000193||0000194<-u d->0000025|
| || || || || |
| DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 |
| || || || || |
| BLOCK # 000000196 || BLOCK # 000000195 || BLOCK # 000000178 || BLOCK # 000000194 || BLOCK # 000000177 |
| || || || || |
| W015 W015 W005 W019 || W025 ???? ???? ???? || ???? ???? ???? W005 || W015 W019 W023 W011 || ???? ???? ???? ???? |
| W009 W015 W007 W015 || ???? ???? ???? ???? || W021 ???? ???? ???? || W019 W019 W009 W023 || ???? ???? ???? ???? |
| W015 W009 W009 W001 || ???? ???? ???? ???? || ???? ???? ???? ???? || ???? ???? ???? ???? || ???? ???? ???? ???? |
| || || || || |
-----------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------
| [0000025] || [0000192] || [0000191] || [0000190] || [0000024] |
| || || || || |
|0000193<-u d->0000192||0000025<-u d->0000191||0000192<-u d->0000190||0000191<-u d->0000024||0000190<-u d->0000189|
| || || || || |
| DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 |
| || || || || |
| BLOCK # 000000161 || BLOCK # 000000193 || BLOCK # 000000176 || BLOCK # 000000160 || BLOCK # 000000145 |
| || || || || |
| ???? ???? ???? ???? || W005 W019 W001 ???? || ???? ???? ???? ???? || ???? ???? ???? ???? || ???? ???? ???? ???? |
| ???? ???? ???? ???? || ???? ???? ???? ???? || ???? ???? ???? ???? || ???? ???? ???? ???? || ???? ???? ???? ???? |
| ???? ???? ???? ???? || ???? ???? ???? ???? || ???? ???? W003 W003 || ???? ???? ???? ???? || ???? ???? ???? ???? |
| || || || || |
-----------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------
| [0000189] || [0000188] || [0000187] || [0000183] || [0000186] |
| || || || || |
|0000024<-u d->0000188||0000189<-u d->0000187||0000188<-u d->0000183||0000187<-u d->0000186||0000183<-u d->0000182|
| || || || || |
| DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 |
| || || || || |
| BLOCK # 000000192 || BLOCK # 000000175 || BLOCK # 000000159 || BLOCK # 000000158 || BLOCK # 000000144 |
| || || || || |
| ???? ???? ???? ???? || ???? ???? ???? ???? || ???? ???? ???? ???? || W011 W011 W011 W011 || ???? ???? ???? ???? |
| ???? ???? W013 ???? || ???? ???? ???? ???? || ???? ???? ???? ???? || W013 W011 W011 W007 || ???? ???? ???? ???? |
| ???? ???? ???? ???? || ???? W007 W007 ???? || ???? ???? ???? W007 || W011 W011 W011 W001 || ???? ???? ???? W033 |
| || || || || |
-----------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------
| [0000004] || [0000029] || [0000028] || [0000027] || [0000003] |
| || || || || |
|0000030<-u d->0000029||0000004<-u d->0000028||0000029<-u d->0000027||0000028<-u d->0000003||0000027<-u d->0000012|
| || || || || |
| DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 |
| || || || || |
| BLOCK # 000000004 || BLOCK # 000000053 || BLOCK # 000000009 || BLOCK # 000000005 || BLOCK # 000000003 |
| || || || || |
| W011 W015 W017 W011 || W053 W053 W029 W049 || W049 W021 W055 W055 || W057 W057 W019 W019 || W047 W047 W047 W045 |
| W013 W013 W007 W011 || W051 W041 W053 W019 || W021 W055 W055 W053 || W051 W051 W031 W057 || W029 W035 W029 W023 |
| W015 W039 W013 W039 || W017 W017 W051 W051 || W057 W057 W057 W021 || W057 W019 W015 W051 || W037 W009 W017 W029 |
| || || || || |
-----------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------
| [0000012] || [0000002] || [0000001] || [0000000] || [NO DATA] |
| || || || || |
|0000003<-u d->0000002||0000012<-u d->0000001||0000002<-u d->0000000||0000001<-u d->XXXXXXX||-------<-u d->-------|
| || || || || |
| DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 || DB 05 SLICE 001 || |
| || || || || |
| BLOCK # 000000027 || BLOCK # 000000002 || BLOCK # 000000001 || BLOCK # 000000000 || |
| || || || || |
| W035 W017 W013 W011 || W047 W027 DRAW W023 || W003 W003 W043 W027 || W005 W003 W007 W003 || |
| W035 W029 W045 W045 || W045 W015 W035 W007 || W049 W061 W013 W049 || W007 W011 W003 W011 || |
| W025 W007 W031 W031 || W007 W019 W049 W051 || W001 W017 W029 W049 || W007 W001 W015 W019 || |
| || || || || |
-----------------------------------------------------------------------------------------------------------------------------
<--------BOTTOM-------->
"XXXXXXX<-u d->0000195" means there is nothing being pointed to "up", and block number 195 is being pointed to "down"
"W015 W015 W005 W019" is a look into the first few values written into that block. Win in 15 plies, Win in 15 pies, Win in 5 plies, Win in 19 plies, etc.
Likewise "0000196<-u d->0000026" means it points "up" to block number 196, and down to block number 26.
Using this layout, I was able to change my indexing functions to make them "more localized", meaning the block would get reused as much as possible before being paged to disk.
My buffer efficiency for very large database operations, computed by "block reads in RAM divided by RAM plus disk reads", was 99.999998% and higher. This is almost as good as being able to compute everything in RAM.
-
- Posts: 859
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Gerard, I was wrong when I wrote this. I forgot that there is also another 32-bit integer for each 4K db block that indicates which cache block it is loaded into, or a special value if not in cache. So for every 4K region of the whole db there are two 32-bit numbers in memory, 8 bytes total. Sorry for my poor memory [img]images/smilies/icon_sad.gif[/img]I use only 32 bits per block, so it is not so much overhead.
-- Ed
Hi Ed.
The difference is very clear : you have the starting index where I have the adress on the disk.
That looks more understandable. So, for each block you have the starting index and the adress in memory (or block number). Thank you Ed. In the new approach I have in mind (blocks with a fixe number of positions, i.e. around 200K positions) I will also use 64 bits per block in order to store the adress on the disk and the adress in memory.Ed Gilbert wrote:Gerard, I was wrong when I wrote this. I forgot that there is also another 32-bit integer for each 4K db block that indicates which cache block it is loaded into, or a special value if not in cache. So for every 4K region of the whole db there are two 32-bit numbers in memory, 8 bytes total. Sorry for my poor memory [img]images/smilies/icon_sad.gif[/img]I use only 32 bits per block, so it is not so much overhead.
-- Ed
The difference is very clear : you have the starting index where I have the adress on the disk.
Gérard