In a previous post I mentioned that the 158-game matches with variable ply depth with Damage were initially played in DEBUG mode, to test the (updated) parallel search routine.
On two occasions an ASSERT occurred. One was related to a bug in the Principal Variation; the other one I could not trace, and it is the one I want to discuss now.
As the EndGame DB is too large to fit in memory, I use an internal DB cache, which for Damage is 4 GByte, consisting of 1M 4-KByte cache blocks.
When a specific 4-KByte block is not loaded yet, the cache handler replaces the LRU (Least Recently Used) cache block.
To optimize transfer to main memory, the EndGame DB is stored on a Solid State Disk (SSD) instead of a (normal) Hard Disk.
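In (C++-style) pseudo-code the cache idea looks roughly like this; the names, the linear LRU scan and the stubbed SSD read are only illustrative, not Damage's actual code:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

constexpr std::size_t BLOCK_SIZE = 4096;      // 4 KByte per cache block
constexpr std::size_t NUM_BLOCKS = 1 << 20;   // 1M blocks -> 4 GByte cache

struct CacheBlock {
    std::int64_t  block_id  = -1;             // which DB block is stored here
    std::uint64_t last_used = 0;              // "timestamp" for LRU replacement
    std::uint8_t  data[BLOCK_SIZE];
};

// Placeholder for the SSD read of one 4-KByte block (hypothetical).
void load_block_from_disk(std::int64_t /*block_id*/, std::uint8_t* dest)
{
    std::memset(dest, 0, BLOCK_SIZE);
}

// Returns the requested DB block, loading it from the SSD (and evicting the
// Least Recently Used block) if it is not in the cache yet.
const std::uint8_t* get_block(std::vector<CacheBlock>& cache,
                              std::int64_t block_id,
                              std::uint64_t& clock)
{
    CacheBlock* lru = &cache[0];
    for (CacheBlock& b : cache) {
        if (b.block_id == block_id) {         // cache hit
            b.last_used = ++clock;
            return b.data;
        }
        if (b.last_used < lru->last_used) lru = &b;
    }
    load_block_from_disk(block_id, lru->data);  // cache miss: replace LRU block
    lru->block_id  = block_id;
    lru->last_used = ++clock;
    return lru->data;
}
```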
Every 4-KByte block starts with a small table (2 bytes per table entry) containing the address offsets of the next 4K entries.
As the WDL values are compressed, this table slightly improves the decompression speed (and because compression is used, a 4-KByte block will in general contain far more than 4K positions).
During the initial DEBUG of this routine I included an ASSERT to test whether the offset was smaller than 4 KByte, as the situation that the DB position is not included in this block should not be possible (under the boundary condition that the pre-processing and block identification were correct so far).
The specific ASSERT which stopped the program was related to this test.
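To make the check concrete, a minimal sketch of such a block lookup and the ASSERT (the layout and names are hypothetical; only the idea that the offset must stay inside the 4-KByte block is taken from the description above):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t BLOCK_SIZE = 4096;

// 'index' selects an entry of the small table at the start of the block; the
// table stores, per entry, the 2-byte offset where decompression of the
// corresponding WDL values starts.
std::uint16_t wdl_offset_in_block(const std::uint8_t* block, std::size_t index)
{
    const std::uint16_t* offsets = reinterpret_cast<const std::uint16_t*>(block);
    std::uint16_t offset = offsets[index];

    // The position must lie inside this 4-KByte block; otherwise the
    // pre-processing, the block identification or the block content itself
    // is wrong. This corresponds to the ASSERT that stopped the program.
    assert(offset < BLOCK_SIZE);

    return offset;   // decompression of the WDL values would start here
}
```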
So the possible options are:
* Pre-process was wrong.
* The processor did a wrong memory read (although the memory value was ok).
* The block was corrupted by another part of the program.
* The transfer from SSD to memory went wrong.
* The information on the SSD was already corrupted.
As the part where Damage does a DB read is a critical section, the situation that another part of the process modifies the block while a read is executed cannot take place (I hope so).
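For completeness, the locking idea in a few lines (a std::mutex here purely for illustration; Damage may use a Windows CRITICAL_SECTION or something else entirely):

```cpp
#include <mutex>

std::mutex db_mutex;   // protects the block cache and the block contents

int probe_endgame_db(/* position arguments */)
{
    std::lock_guard<std::mutex> guard(db_mutex);   // enter critical section
    // ... identify the block, fetch it via the cache (loading from the SSD on
    // a miss), read the offset table and decompress the WDL value ...
    return 0;   // placeholder result
}
```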
As I was able to pin down the specific database position, I could do a re-read later, which indicated no problem, so it seems logical to assume that the content on the SSD was not the reason for the ASSERT.
Unfortunately (with hindsight one always knows what to do better) I did not write down the specific 2-byte index value when the ASSERT took place, and therefore was not able to compare it with the second read, which could already have revealed some clues.
At this point I guess the most plausible explanation is that (for whatever reason) an error occurred during SSD access.
I guess these errors are not frequent, but when the computer is running day and night for 2 weeks, the chance might not be zero.
Basically I don't care that once in a while an error is introduced this way, as the chance that it will impact the tree result is small.
On the other hand, I don't want the program to crash when this situation occurs.
As the ASSERTs are not compiled in RELEASE mode, and I haven't seen crashes lately, I have no insight into the frequency.
So I'm thinking of also including some tests in RELEASE mode (especially in the DB handler), to avoid these surprises.
Given the example, one could for instance read twice if the index is outside the 4-KByte range, or re-read the specific block; a sketch of that idea follows below.
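A possible shape of such a RELEASE-mode check (all names are hypothetical; this only sketches the "re-read the specific block" idea, not Damage's actual DB handler):

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t BLOCK_SIZE = 4096;

// Hypothetical helpers assumed to exist in the DB handler:
const std::uint8_t* get_block(std::int64_t block_id);                 // cached fetch
int  decompress_wdl_at(const std::uint8_t* block, std::uint16_t off); // WDL lookup
void invalidate_block(std::int64_t block_id);                         // drop cached copy

// Validate the offset at run time (also in RELEASE) and re-read the block
// from the SSD once before giving up, instead of asserting.
bool probe_with_retry(std::int64_t block_id, std::size_t index, int& wdl)
{
    for (int attempt = 0; attempt < 2; ++attempt) {
        const std::uint8_t*  block   = get_block(block_id);
        const std::uint16_t* offsets =
            reinterpret_cast<const std::uint16_t*>(block);
        std::uint16_t offset = offsets[index];

        if (offset < BLOCK_SIZE) {                   // sanity check kept in RELEASE
            wdl = decompress_wdl_at(block, offset);
            return true;
        }
        invalidate_block(block_id);                  // force a fresh read from SSD
    }
    return false;   // still out of range after the re-read: let the caller decide
}
```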
My question, for those who don't have ECC memory: do you know/recognize these types of random failures (cosmic rays, quantum mechanics or whatever), did you also see them in your program, and did you include some extra test mechanism in the search to detect and avoid them?
Bert