Search Algorithm

Discussion about the development of draughts in the age of computers and the Internet.
Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Search Algorithm

Post by Joost Buijs » Sat Sep 10, 2016 14:30

Bert,

At the moment I store the usual things: depth, value, move and flags (Invalid = 0, Ubound = 1, Lbound = 2 and Exact = 3).
I use a simple formula to determine the fitness of an entry, '4 * depth + flags', which seems to work fine. I left 'age' out of the equation because I never found much use for it.
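
In outline the entry and its fitness test look something like this (a sketch; the field widths and packing are illustrative):

Code:

#include <cstdint>

enum Flags { Invalid = 0, Ubound = 1, Lbound = 2, Exact = 3 };

struct Entry {
    uint32_t lock;    // upper bits of the hash key, for verification
    uint16_t move;
    int16_t  value;
    uint8_t  depth;
    uint8_t  flags;   // one of the Flags values above
};

// Fitness: prefer deeper entries, break ties on bound type.
// No 'age' term; it never seemed to bring much.
inline int fitness(const Entry& e) {
    return 4 * e.depth + e.flags;
}

// Replacement: overwrite the stored entry when the new data
// is at least as fit as what is already there.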

The 4-entry bucket system helps a little with a small hash-table and very long searches; with the large memory of modern computers it is not necessary, and it gives some overhead as well.
What really helps is allocating the hash-table in large-page memory. For instance, on the Woldouby at 23 ply (still without history) with 1 GB hash, a search normally takes 14.5 sec., and with LP it takes 12.8 sec.; with larger hash-tables the difference is even greater.
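
On Windows the large-page allocation comes down to something like this (a sketch, error handling omitted; the process needs the "Lock pages in memory" privilege, hence running the engine as administrator):

Code:

#include <windows.h>

// Sketch: allocate the hash-table in large-page memory and fall
// back to small pages when large pages are unavailable.
void* alloc_tt(size_t bytes) {
    size_t lp = GetLargePageMinimum();   // 0 if large pages unsupported
    if (lp != 0) {
        // Round the size up to a multiple of the large-page size.
        size_t rounded = (bytes + lp - 1) & ~(lp - 1);
        void* p = VirtualAlloc(NULL, rounded,
                               MEM_RESERVE | MEM_COMMIT | MEM_LARGE_PAGES,
                               PAGE_READWRITE);
        if (p != NULL)
            return p;                    // LP memory obtained
    }
    // Fall back to normal (small-page) memory.
    return VirtualAlloc(NULL, bytes, MEM_RESERVE | MEM_COMMIT,
                        PAGE_READWRITE);
}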

Joost

Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Search Algorithm

Post by Joost Buijs » Sat Sep 10, 2016 19:18

To show that there is really a big difference between small-page memory and large-page memory for the hash-table, I did two tests, one with SP and one with LP memory.
In both cases the hash-table has a size of 8 GB; the evaluation is material only, with the pattern evaluation in place but filled with zeros.
The nps goes from 10.7 million to 13 million, a nice difference which comes completely without drawback, apart from the fact that you have to run the engine as administrator.

Code:


TT: num_buckets = 536870912, tt_size = 8589934592
Allocated TT with size 8589934592 in SP memory.

    -  -  -  -  -
  -  -  -  -  -
    -  m  m  m  -
  m  -  m  m  -
    m  -  m  m  M
  m  M  M  -  M
    -  M  M  M  M
  -  M  M  -  -
    -  -  -  -  -
  -  -  -  -  -

Depth = 24  Score = -100  Time = 28.40  Nodes = 303640850  Nps = 10692163.776989


TT: num_buckets = 536870912, tt_size = 8589934592
Allocated TT with size 8589934592 in LP memory.

    -  -  -  -  -
  -  -  -  -  -
    -  m  m  m  -
  m  -  m  m  -
    m  -  m  m  M
  m  M  M  -  M
    -  M  M  M  M
  -  M  M  -  -
    -  -  -  -  -
  -  -  -  -  -

Depth = 24  Score = -100  Time = 23.38  Nodes = 303640850  Nps = 12986125.555344


Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Search Algorithm

Post by Joost Buijs » Sun Sep 11, 2016 09:57

Yesterday I added 2 killers to my move-picker, which, besides the hash-table, gives a nice reduction in the branching factor; even adding the second killer helps quite a lot.

After this I added history, which gives a lot of overhead, and as it seems now it reduces the branching factor by such a small margin that it doesn't outweigh the additional overhead.
Unlike with chess, history doesn't seem to help much with draughts; the hash-table and the 2 killers are already very effective. Maybe I have to play with it a little longer, but I get the impression that it is better to throw history out completely.
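
The killer slots themselves amount to very little code; a sketch of the conventional update rule (Move and MAX_PLY are illustrative):

Code:

#include <cstdint>

typedef uint16_t Move;
const int MAX_PLY = 128;

// Two killer slots per ply.
struct Killers { Move k1, k2; };
Killers killer[MAX_PLY];

// On a beta cutoff by a quiet move, shift it into the first slot
// so the previous first killer survives as the second.
void update_killers(int ply, Move m) {
    if (killer[ply].k1 != m) {
        killer[ply].k2 = killer[ply].k1;
        killer[ply].k1 = m;
    }
}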

It is possible that history doesn't do much because the positional evaluation table is still zero.

Any thoughts about this?

Joost

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Search Algorithm

Post by BertTuyt » Sun Sep 11, 2016 11:06

Joost,

In Damage I use history, and there I found that killers did not add much to the equation :(.
It could be a wrong implementation, or when search complexity increases (more score values due to a more complex evaluation, LMR, ...) the outcome might be different.
I tested several options independently in the past, but combined I always encountered diminishing returns.

So I don't have a clear answer yet. In a perfect tree I expect that the hash score and hash move could be sufficient, as the first suggested move already works for a CUT node, and for ALL nodes you need to search all moves anyway.

Also, one of the things which was discussed (not sure it was fully implemented) is to add more moves to the hashtable.
I think it was studied by Breuker (not sure), and the outcome was that it did not help.
But again, most likely different mnps and hashtable sizes at that point in time.

I added a listing of the Dwarf output for the Woldouby position.
For a 24-ply search Dwarf needed 1.806.178.806 nodes.

But I'm not sure whether we are comparing apples with oranges; it would be good to sync here, so we can really compare implementations.

So here are the implementation features (so far):
* A ply is always a ply, so even if there is only 1 move (a capture) it is counted as a ply
* I added the nodes from the Q-Search to the total.
* Material-only evaluation (same as you), 100 points for a man, 320 for a king.
* The search stops at ply 0 (including Q-Search) when the side to move has no capture (there are also implementations where there should also be no opposite capture threat).
* 1M entries Hashtable (1 bucket system).
* Only the score is stored (no best move)
* Plain alpha-beta
* Hash write only for a beta cutoff, and always overwrite (so I don't have a priority based upon depth)
* Beta cutoff based upon the hash score if hashscore >= beta && hashdepth >= searchdepth (one can debate using a cutoff only for hashdepth == searchdepth); see the sketch below
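
In code those two hash rules come down to something like this (a sketch; the Entry layout here is illustrative):

Code:

#include <cstdint>

struct Entry {
    uint32_t lock;    // verification bits of the hash key
    int16_t  score;
    uint8_t  depth;
};

// Probe: take the cutoff when the stored score fails high at
// sufficient depth (hashscore >= beta && hashdepth >= searchdepth).
bool tt_cutoff(const Entry& e, int searchdepth, int beta, int& score) {
    if (e.depth >= searchdepth && e.score >= beta) {
        score = e.score;
        return true;
    }
    return false;
}

// Store: only after a beta cutoff, and always overwrite
// (only the score is kept, no best move).
void tt_store(Entry& e, uint32_t lock, int depth, int score) {
    e.lock  = lock;
    e.depth = (uint8_t)depth;
    e.score = (int16_t)score;
}

And here the Dwarf listing: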

Code:

id name Dwarf 0.0
id author Draughts Community
ready
HashTable Allocated, Entries = 1048576
pos XW XB 10 10
info D 1 S 0 N 22 T 0 knps 0
info D 2 S 0 N 80 T 0.001 knps 80
info D 3 S 0 N 187 T 0.002 knps 93
info D 4 S 0 N 432 T 0.002 knps 216
info D 5 S 0 N 1021 T 0.02 knps 51
info D 6 S 0 N 1941 T 0.021 knps 92
info D 7 S 0 N 4029 T 0.022 knps 183
info D 8 S 0 N 7589 T 0.023 knps 329
info D 9 S 0 N 15086 T 0.024 knps 628
info D 10 S 0 N 30168 T 0.026 knps 1160
info D 11 S 0 N 64128 T 0.03 knps 2137
info D 12 S 0 N 115116 T 0.035 knps 3289
info D 13 S 0 N 246443 T 0.047 knps 5243
info D 14 S -100 N 509589 T 0.069 knps 7385
info D 15 S -100 N 969606 T 0.108 knps 8977
info D 16 S -100 N 2270953 T 0.208 knps 10918
info D 17 S -100 N 4529726 T 0.388 knps 11674
info D 18 S -120 N 9870413 T 0.769 knps 12835
info D 19 S -100 N 17575176 T 1.327 knps 13244
info D 20 S -100 N 46327144 T 3.259 knps 14215
info D 21 S -100 N 98866666 T 6.835 knps 14464
info D 22 S -100 N 242208193 T 15.63 knps 15496
info D 23 S -100 N 578163225 T 37.603 knps 15375
info D 24 S -100 N 1806178806 T 106.85 knps 16903

Time to depth is for now less relevant for me (in the comparison with you), as my notebook computer is slower (and you have already prepared for a more complex evaluation), so we can compare nodes.

From here I can make small (suggested) changes and measure impact.

Bert

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Search Algorithm

Post by BertTuyt » Sun Sep 11, 2016 11:22

A small example: when I double the HashTable in Dwarf (so from 2^20 to 2^21 entries), the node count for the 24-ply Woldouby search decreases to 1.750.654.678, around a 3% decrease.

EDIT:
And when I combine the larger table with always writing to the hash (so not only on a beta cut): 1.670.230.192, which is a further 5% decrease :D

Bert

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Search Algorithm

Post by BertTuyt » Sun Sep 11, 2016 11:47

I forgot to mention: during the Q-Search phase I don't store results in the HashTable (so only during the regular search).

Bert

Fabien Letouzey
Posts: 299
Joined: Tue Jul 07, 2015 07:48
Real name: Fabien Letouzey

Re: Search Algorithm

Post by Fabien Letouzey » Sun Sep 11, 2016 12:09

Joost Buijs wrote:
It is possible that history doesn't do much because the positional evaluation table is still zero.

Any thoughts about this?

Yes: keep the history code nearby. My results are in line with Bert's: killers never brought anything, while history is very measurable.
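
For reference, the conventional dynamic history table is just a counter per side and from/to square, bumped on a beta cutoff and used to order the remaining quiet moves. A sketch (the array sizes and the depth*depth bonus are illustrative, not my exact scheme):

Code:

// One counter per side and from/to square (50 squares on 10x10).
int history[2][50][50];

// Reward the move that caused a beta cutoff; deeper cutoffs
// count for more.
void history_good(int side, int from, int to, int depth) {
    history[side][from][to] += depth * depth;
}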

Testing move ordering with a material-only evaluation is risky business of course. This could inflate results for killers.

Following Othello, you might want to try some form of response killers (they make more sense for blocking games). I haven't.

Fabien.

Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Search Algorithm

Post by Joost Buijs » Sun Sep 11, 2016 12:26

Bert,

My conditions are almost the same: I count nodes the same way, I have set the hash-table to a single entry per bucket (so it will probably behave the same as your implementation), and I don't use the hash-table in quiescence.

The differences are 2 killers per ply, and that I store the best move in the hash-table as well.
Other differences are that I store each node in the hash-table with bounds (upper, lower, exact), and that my search is not iterative yet, which probably has a negative influence on the branching factor.

I first want to solve that history mystery by putting something sensible in my evaluation table, and if that doesn't help I will throw history out, because I like simple and neat solutions, and a history table makes the move-picker very messy and, to say the least, slow.

Joost

Somehow Dwarf uses 6 times as many nodes. These are my last results with history disabled and 1 GB hash:

Code:


TT: num_buckets = 67108864, tt_size = 1073741824
Allocated TT with size 1073741824 in SP memory.

    -  -  -  -  -
  -  -  -  -  -
    -  m  m  m  -
  m  -  m  m  -
    m  -  m  m  M
  m  M  M  -  M
    -  M  M  M  M
  -  M  M  -  -
    -  -  -  -  -
  -  -  -  -  -

Depth = 24  Score = -100  Time = 25.90  Nodes = 304004161  Nps = 11737093.018335


Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Search Algorithm

Post by Joost Buijs » Sun Sep 11, 2016 12:46

Fabien Letouzey wrote:
Joost Buijs wrote:
It is possible that history doesn't do much because the positional evaluation table is still zero.

Any thoughts about this?

Yes: keep the history code nearby. My results are in line with Bert's: killers never brought anything, while history is very measurable.

Testing move ordering with a material-only evaluation is risky business of course. This could inflate results for killers.

Following Othello, you might want to try some form of response killers (they make more sense for blocking games). I haven't.

Fabien.

Hi Fabien,

Yes, I agree that I have to put some sensible values into my evaluation tables before I can continue testing move-ordering, and there are still many other things to do.

I've been busy with draughts for 4 months now (after an absence of 24 years) and it will take some time to catch up.
Somehow I have lost interest in chess programming completely, and for me draughts is something new to discover.

Joost

Ed Gilbert
Posts: 859
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Re: Search Algorithm

Post by Ed Gilbert » Sun Sep 11, 2016 13:43

My experience is similar to Bert's and Fabien's. In kingsrow the history table is a substantial benefit in reducing nodes searched. Killer moves hardly help at all, and a second killer move was no help.

In 8x8 English checkers I created a static history table, actually 8 of them -- 4 tables for man-only positions and 4 tables for positions with kings. Each table is for a range of 6 in the number of pieces, so for example one table for 19 through 24 pieces, ... The history was acquired during an engine match of a few hundred games. This table works better than the more conventional dynamic history table. I have gone back to retest this several times, and it's always the same result. I never tried it with 10x10 because I don't think it would work well there; the game is too different. 8x8 checkers is more constrained.
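
In outline such a bank of tables looks something like this (a sketch; the from/to indexing is just one possibility):

Code:

#include <cstdint>

// Static history: 4 piece-count ranges of 6 (24-19, 18-13, 12-7,
// 6-1), doubled for man-only positions versus positions with kings.
// Values are filled offline from an engine match, then read-only.
int16_t static_history[2][4][32][32];   // [kings?][range][from][to]

inline int history_score(bool kings, int npieces, int from, int to) {
    int range = (24 - npieces) / 6;     // 24..19 -> 0, 18..13 -> 1, ...
    return static_history[kings][range][from][to];
}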

-- Ed

Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Search Algorithm

Post by Joost Buijs » Sun Sep 11, 2016 14:02

Ed Gilbert wrote:
My experience is similar to Bert's and Fabien's. In kingsrow the history table is a substantial benefit in reducing nodes searched. Killer moves hardly help at all, and a second killer move was no help.

In 8x8 English checkers I created a static history table, actually 8 of them -- 4 tables for man-only positions and 4 tables for positions with kings. Each table is for a range of 6 in the number of pieces, so for example one table for 19 through 24 pieces, ... The history was acquired during an engine match of a few hundred games. This table works better than the more conventional dynamic history table. I have gone back to retest this several times, and it's always the same result. I never tried it with 10x10 because I don't think it would work well there; the game is too different. 8x8 checkers is more constrained.

-- Ed

My best guess is that I first have to implement all the other features, like pruning and a sensible evaluation function, before I can say anything about it.
I tried to put some small random values in my evaluation table instead of zeros to increase diversity, but the results remain the same; at best the history breaks even with the overhead it gives.
I checked everything very carefully and I can't find any errors in my implementation; it completely puzzles me.

Joost

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Search Algorithm

Post by BertTuyt » Sun Sep 11, 2016 14:37

Ed,

A static table is an interesting concept.
Maybe it would already be sufficient if the table only dealt with man moves.
I think there are only 9 * 9 possible man moves for each color (every row 9 moves, and the promotion row doesn't count).

Bert

Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Search Algorithm

Post by Joost Buijs » Sun Sep 11, 2016 14:47

BertTuyt wrote:
But I'm not sure whether we are comparing apples with oranges; it would be good to sync here, so we can really compare implementations.

So here are the implementation features (so far):
* A ply is always a ply, so even if there is only 1 move (a capture) it is counted as a ply
* I added the nodes from the Q-Search to the total.
* Material-only evaluation (same as you), 100 points for a man, 320 for a king.
* The search stops at ply 0 (including Q-Search) when the side to move has no capture (there are also implementations where there should also be no opposite capture threat).
* 1M entries Hashtable (1 bucket system).
* Only the score is stored (no best move)
* Plain alpha-beta
* Hash write only for a beta cutoff, and always overwrite (so I don't have a priority based upon depth)
* Beta cutoff based upon the hash score if hashscore >= beta && hashdepth >= searchdepth (one can debate using a cutoff only for hashdepth == searchdepth)

Bert

I forgot to mention that I already added a null-window search; basically it is PVS now, but I still have to add iteration and aspiration to it. Maybe this accounts for the huge difference with Dwarf.

Edit:

I've added a very simple form of iteration and aspiration; the results look rather good, and I have the impression that my hash-table implementation is very effective.
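
In its simplest form that loop looks something like this (a sketch; search(), the window width and the full-window re-search are illustrative):

Code:

const int INF = 30000;

int search(int depth, int alpha, int beta);   // the normal PVS search

// Iterative deepening with a basic aspiration window around the
// score of the previous iteration.
int iterate(int max_depth) {
    int score = search(1, -INF, +INF);
    for (int depth = 2; depth <= max_depth; ++depth) {
        int alpha = score - 50, beta = score + 50;
        score = search(depth, alpha, beta);
        if (score <= alpha || score >= beta)      // outside the window:
            score = search(depth, -INF, +INF);    // re-search full width
    }
    return score;
}

And here the results: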

Code:


TT: num_buckets = 16777216, tt_size = 1073741824
Allocated TT with size 1073741824 in SP memory.

    -  -  -  -  -
  -  -  -  -  -
    -  m  m  m  -
  m  -  m  m  -
    m  -  m  m  M
  m  M  M  -  M
    -  M  M  M  M
  -  M  M  -  -
    -  -  -  -  -
  -  -  -  -  -

depth  1  score     0  time   0.00  nodes        14  nps  390625
depth  2  score     0  time   0.00  nodes        47  nps  341676
depth  3  score     0  time   0.00  nodes       106  nps  236337
depth  4  score     0  time   0.00  nodes       192  nps  399503
depth  5  score     0  time   0.00  nodes       456  nps  787235
depth  6  score     0  time   0.00  nodes       816  nps 1252948
depth  7  score     0  time   0.00  nodes      1695  nps 2154369
depth  8  score     0  time   0.00  nodes      2921  nps 3110729
depth  9  score     0  time   0.00  nodes      5786  nps 4720459
depth 10  score     0  time   0.00  nodes      9714  nps 5853348
depth 11  score     0  time   0.00  nodes     18347  nps 7591945
depth 12  score     0  time   0.00  nodes     29778  nps 8293585
depth 13  score     0  time   0.01  nodes     55301  nps 9877133
depth 14  score  -100  time   0.02  nodes    224104  nps 10801082
depth 15  score  -100  time   0.03  nodes    295634  nps 11312597
depth 16  score  -100  time   0.04  nodes    405073  nps 11320155
depth 17  score  -100  time   0.05  nodes    601476  nps 12001120
depth 18  score  -100  time   0.07  nodes    831874  nps 11610857
depth 19  score  -100  time   0.12  nodes   1368915  nps 11740283
depth 20  score  -100  time   0.18  nodes   2030094  nps 11204291
depth 21  score  -100  time   0.31  nodes   3555456  nps 11347430
depth 22  score  -100  time   0.67  nodes   7372390  nps 10936320
depth 23  score  -100  time   1.17  nodes  13048894  nps 11121231
depth 24  score  -100  time   1.79  nodes  19291202  nps 10792338
depth 25  score  -100  time   3.41  nodes  38107179  nps 11178210
depth 26  score  -100  time   5.37  nodes  58022028  nps 10813837
depth 27  score  -100  time  11.71  nodes 130945734  nps 11183798
depth 28  score  -100  time  18.06  nodes 195024270  nps 10801120
depth 29  score  -100  time  36.25  nodes 393461825  nps 10855354
depth 30  score  -100  time  60.13  nodes 629381061  nps 10467119

Here is the same test but with the evaluation tables filled with small random numbers from -32 to +32; the tree gets about twice as large as before, which is to be expected.

Code:


TT: num_buckets = 16777216, tt_size = 1073741824
Allocated TT with size 1073741824 in SP memory.

    -  -  -  -  -
  -  -  -  -  -
    -  m  m  m  -
  m  -  m  m  -
    m  -  m  m  M
  m  M  M  -  M
    -  M  M  M  M
  -  M  M  -  -
    -  -  -  -  -
  -  -  -  -  -

depth  1  score    54  time   0.00  nodes         22  nps  4603794
depth  2  score    54  time   0.00  nodes         40  nps   356193
depth  3  score    54  time   0.00  nodes         96  nps   232823
depth  4  score   -17  time   0.00  nodes        252  nps   567909
depth  5  score    -4  time   0.00  nodes        527  nps  1090357
depth  6  score   -30  time   0.00  nodes        984  nps  1818809
depth  7  score   -22  time   0.00  nodes       1971  nps  3109539
depth  8  score   -54  time   0.00  nodes       3524  nps  4432897
depth  9  score   -32  time   0.00  nodes       7143  nps  6435041
depth 10  score   -79  time   0.00  nodes      12987  nps  7528264
depth 11  score   -22  time   0.00  nodes      28521  nps  9732976
depth 12  score   -32  time   0.00  nodes      39877  nps  9857166
depth 13  score    -6  time   0.01  nodes      67823  nps 10777835
depth 14  score   -96  time   0.01  nodes     163736  nps 11006476
depth 15  score   -20  time   0.02  nodes     296383  nps 11857291
depth 16  score  -126  time   0.06  nodes     633255  nps 11246531
depth 17  score   -59  time   0.11  nodes    1309694  nps 11896511
depth 18  score   -96  time   0.14  nodes    1613675  nps 11792756
depth 19  score   -79  time   0.22  nodes    2556147  nps 11587391
depth 20  score  -134  time   0.38  nodes    4374822  nps 11502348
depth 21  score   -96  time   0.70  nodes    8009070  nps 11457232
depth 22  score  -144  time   1.05  nodes   11892274  nps 11291813
depth 23  score   -97  time   2.07  nodes   23110951  nps 11185538
depth 24  score  -128  time   2.83  nodes   31004860  nps 10972033
depth 25  score  -117  time   4.75  nodes   52611246  nps 11084726
depth 26  score  -140  time   8.09  nodes   87882525  nps 10864876
depth 27  score  -104  time  14.93  nodes  164842656  nps 11042734
depth 28  score  -133  time  36.85  nodes  396376431  nps 10757066
depth 29  score  -116  time  73.54  nodes  781962053  nps 10633130
depth 30  score  -144  time 124.60  nodes 1312617330  nps 10534831


Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Search Algorithm

Post by Joost Buijs » Mon Sep 12, 2016 08:12

Ed Gilbert wrote:
My experience is similar to Bert's and Fabien's. In kingsrow the history table is a substantial benefit in reducing nodes searched. Killer moves hardly help at all, and a second killer move was no help.

In 8x8 English checkers I created a static history table, actually 8 of them -- 4 tables for man-only positions and 4 tables for positions with kings. Each table is for a range of 6 in the number of pieces, so for example one table for 19 through 24 pieces, ... The history was acquired during an engine match of a few hundred games. This table works better than the more conventional dynamic history table. I have gone back to retest this several times, and it's always the same result. I never tried it with 10x10 because I don't think it would work well there; the game is too different. 8x8 checkers is more constrained.

Indeed, it depends very much upon the position. I'm testing on the Woldouby because that looks to me like a typical midgame position, but now I clearly see that history works better at e.g. the starting position. To tune move-ordering I need a small database of positions from different game stages, which I don't have; since my engine is still in an infantile state, I guess it is better to continue with implementing IO, evaluation and SMP before trying to tune move-ordering.

A static history table is certainly something to consider. At the moment I use a fast sigmoid function to update the history-table, and that gives additional overhead; it would be nice if this could be avoided completely. For move sorting I use an insertion sort, which is not cheap either; maybe there are better ways to do this (see the sketch below).
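
For reference, a straight insertion sort of the move list by descending score (a sketch; the names are illustrative):

Code:

#include <cstdint>

typedef uint16_t Move;

// Sort moves[0..n-1] by descending score. Stable, and cheap for
// the short move lists of draughts.
void sort_moves(Move* moves, int* scores, int n) {
    for (int i = 1; i < n; ++i) {
        Move m = moves[i];
        int  s = scores[i];
        int  j = i - 1;
        while (j >= 0 && scores[j] < s) {   // shift worse moves right
            moves[j + 1]  = moves[j];
            scores[j + 1] = scores[j];
            --j;
        }
        moves[j + 1]  = m;
        scores[j + 1] = s;
    }
}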

Joost

Edit:

I finally solved the puzzle: it was a bug in the insertion sort. It had a tendency to put the best move in the second spot; I overlooked this several times because on many occasions the first and second move have the same history values. This happens when you want to write everything yourself and have been sitting too many hours behind the computer. :?
Now I see the same behavior as you, Fabien and Bert: on the starting position 1 killer still helps, and with 2 killers the performance goes down a bit. I have to add that on the Woldouby the second killer still shaves 17.5% off the search tree.

Compared to the test above, the 30-ply search now takes 50.83 sec. instead of 60.13 sec.

Code:


TT: num_buckets = 16777216, tt_size = 1073741824
Allocated TT with size 1073741824 in SP memory.

    -  -  -  -  -
  -  -  -  -  -
    -  m  m  m  -
  m  -  m  m  -
    m  -  m  m  M
  m  M  M  -  M
    -  M  M  M  M
  -  M  M  -  -
    -  -  -  -  -
  -  -  -  -  -

depth 24  score  -100  time   1.44  nodes   15118089  nps 10463924
depth 25  score  -100  time   2.43  nodes   24968485  nps 10264117
depth 26  score  -100  time   4.22  nodes   44160117  nps 10463409
depth 27  score  -100  time   8.17  nodes   83247905  nps 10194984
depth 28  score  -100  time  15.18  nodes  157726520  nps 10389472
depth 29  score  -100  time  28.64  nodes  288649885  nps 10080026
depth 30  score  -100  time  50.83  nodes  518920781  nps 10209554


BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Search Algorithm

Post by BertTuyt » Tue Sep 13, 2016 20:40

Joost, impressive results.

I think if you added the 8P DB from Ed and a DB-handler, you would end up with a 0 score :D
I would be interested to know what the current computer draughts world record is for solving the Woldouby; I think Ed has the record, but I don't know the timing involved.

As I'm travelling, I will implement minimal-window search next weekend, to compare with your results.

Bert
