BertTuyt wrote:
Joost, I think this is a world record for speed.
Although it would be partly cloning, it would be interesting to see what Elo you would get from a shortcut implementation based on Fabien's tables and Ed's databases.
I know that in the end you will use 100% your own code, but just as a yardstick.
Especially as we had a discussion some time ago about when we believe (more or less) perfect play would be reached.
Ballpark figure: 250 - 300 Elo points to gain compared with Kingsrow.
It would already be an achievement if you could prove that with this speed one would gain +30 Elo (compared with, and playing against, Scan; 158-game match, 5-10 minutes per game).
Note: In Damage I use a global hash table (with a Hyatt-type lock) and local history tables.
Bert
Bert,
It would not be very difficult to convert the evaluation values of Scan for use in my program; in that case I would also have to use the same interpolation as Scan does. The point is that I use a different indexing scheme, so the conversion will take time, and it might be wiser to spend that time on my own ML.
My hash tables also use a Hyatt-type lock-free scheme: I XOR the key with the data field so that a probe can verify whether the read or write was atomic. Intel CPUs also have an instruction, CMPXCHG16B, that can be used for 128-bit atomic loads and stores; that is another possibility.
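For readers who have not seen the XOR trick, here is a minimal sketch of the idea. It is not the code from my program: the names `Entry`, `HashTable`, `store` and `probe` are made up for illustration, and the entry is assumed to be just two 64-bit words (the key check and one packed data word).

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical entry layout: two 64-bit words, each written atomically on
// its own, but not atomically as a pair.
struct Entry {
    std::atomic<std::uint64_t> check{0};  // key XOR data
    std::atomic<std::uint64_t> data{0};   // packed score, depth, move, ...
};

class HashTable {
public:
    // size_pow2 is assumed to be a power of two so that masking works.
    explicit HashTable(std::size_t size_pow2)
        : table_(size_pow2), mask_(size_pow2 - 1) {}

    void store(std::uint64_t key, std::uint64_t data) {
        Entry& e = table_[key & mask_];
        // Store key XOR data instead of the key itself.
        e.check.store(key ^ data, std::memory_order_relaxed);
        e.data.store(data, std::memory_order_relaxed);
    }

    // Returns true only if the two words belong together for this key.
    // If another thread overwrote one word between our two loads, the XOR
    // will (almost certainly) not reproduce the key and the probe is rejected.
    // Note: an empty slot (0, 0) would match key 0 in this simple sketch.
    bool probe(std::uint64_t key, std::uint64_t& data_out) const {
        const Entry& e = table_[key & mask_];
        const std::uint64_t check = e.check.load(std::memory_order_relaxed);
        const std::uint64_t data = e.data.load(std::memory_order_relaxed);
        if ((check ^ data) != key) return false;
        data_out = data;
        return true;
    }

private:
    std::vector<Entry> table_;
    std::uint64_t mask_;
};
```

A torn entry simply looks like a miss, so no lock is needed; the price is one extra XOR per probe and store.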
C++11 has built-in atomics, which I use for SMP. When you define a 128-bit atomic, it is possible that they will use that CPU instruction, but it is also possible that they use a lock, and that is something you don't want.
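Whether a given atomic falls back to a lock can be asked at run time with `is_lock_free()`. A small sketch, where `Entry128` and `atomic_is_lock_free_for` are hypothetical names and the 16-byte layout is only an assumption about what a 128-bit entry might look like:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical 128-bit entry; 16-byte alignment is what CMPXCHG16B needs.
struct alignas(16) Entry128 {
    std::uint64_t key;
    std::uint64_t data;
};

// Reports whether std::atomic<T> is implemented without a lock on this
// compiler and target. T must be trivially copyable.
template <class T>
bool atomic_is_lock_free_for() {
    std::atomic<T> probe{T{}};
    return probe.is_lock_free();
}
```

Calling `atomic_is_lock_free_for<Entry128>()` gives the answer for the 128-bit case. The result depends on compiler, flags and target, so I would not assume it either way, and with some toolchains (GCC, for example) 16-byte atomics may need linking against libatomic.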
Joost
Edit:
I've been optimizing the SMP somewhat more, and the speedup (time to depth) is now in the region of 5.3 to 5.8 on 8 cores; it depends a little on the complexity of the position.
A split depth of 6 or 7 seems to work best, and at least 3 moves should be available before splitting. Another restriction is that I never let more than 4 cores work together on the same split point, and I never allow more than 4 split points in the same thread (this will never happen with <= 8 cores, though).
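The restrictions above boil down to two small tests, roughly like the sketch below. The constants are the figures quoted above; everything else (`SplitPoint`, `ThreadState`, `can_split`, `can_join`, and reading "split depth" as a minimum remaining depth) is an assumption for illustration, not the actual code.

```cpp
// Figures from the text; 6 is used as the split depth here (7 also works).
constexpr int kMinSplitDepth = 6;       // minimum remaining depth to split
constexpr int kMinMovesLeft = 3;        // moves still to search at the node
constexpr int kMaxThreadsPerSplit = 4;  // cores working on one split point
constexpr int kMaxSplitsPerThread = 4;  // split points owned by one thread

// Hypothetical bookkeeping structures.
struct SplitPoint {
    int workers = 1;  // the owning thread counts as a worker
};

struct ThreadState {
    int active_splits = 0;  // split points currently open in this thread
};

// May this thread create a new split point at the current node?
inline bool can_split(int depth, int moves_left, const ThreadState& t) {
    return depth >= kMinSplitDepth &&
           moves_left >= kMinMovesLeft &&
           t.active_splits < kMaxSplitsPerThread;
}

// May an idle thread join an existing split point?
inline bool can_join(const SplitPoint& sp) {
    return sp.workers < kMaxThreadsPerSplit;
}
```

Note that there is deliberately no test on node type or on captures in this sketch.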
I tried several things, like splitting only at all-nodes and splitting only when there are no captures; these things don't work at all. The branching factor in draughts is very low, which makes it difficult to keep the processors working, so you have to grab every opportunity to keep them busy.