Post
by BertTuyt » Sun Jan 29, 2017 12:22
Interesting discussion.
So herewith my 5 cents...
So far we lack a theoretical model for the strength of a draughts program in relation to search depth and the strength of the evaluation.
Within this context, we also don't have a metric to quantify the evaluation function of a program.
What we assume (and correct me if I'm wrong):
1) The game in itself is always a draw when two programs play perfectly.
2) Rating (for now, in line with chess, let's call it ELO) cannot grow to infinity, so I guess there is a maximum rating E.max.
3) When a program reaches infinite depth it will play perfectly (even without evaluation knowledge), so for d --> infinity, E --> E.max.
4) When a program has perfect knowledge it will play perfectly (even without searching), so for e --> infinity, E --> E.max.
As Joost pointed out, a more efficient implementation of a program with a given evaluation will play better.
On the other hand, there might be a break-even point: if one adds more knowledge, which reduces search depth, the program overall might become weaker.
I don't know whether the increase in performance per ply (measured in ELO) is linear, and whether that value (as we tested) is independent of the evaluation implementation.
What we have seen in some tests is that there definitely seem to be diminishing returns with increasing search depth.
It would be interesting to test this with new programs like Argus, which are able to reach search depths in a domain we have not tested before.
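To make such a test concrete, here is a minimal C++ sketch (the match results below are made-up numbers, purely for illustration) of how one could turn self-play scores between depth d and depth d+1 into an ELO gain per ply, using the standard logistic rating formula:

[code]
#include <cmath>
#include <cstdio>

// ELO difference implied by a match score fraction s (0 < s < 1),
// using the standard logistic Elo model.
double elo_diff(double s) {
    return 400.0 * std::log10(s / (1.0 - s));
}

int main() {
    // Hypothetical self-play results: depth d+1 versus depth d,
    // as (wins, draws, losses) from the deeper side's viewpoint.
    struct Match { int depth; int w, d, l; };
    Match matches[] = {
        {10, 120, 300, 80},   // made-up numbers for illustration
        {16,  90, 360, 50},
        {22,  60, 410, 30},
    };
    for (const Match& m : matches) {
        double games = m.w + m.d + m.l;
        double s = (m.w + 0.5 * m.d) / games;
        std::printf("depth %d -> %d: score %.3f, ELO gain %.1f\n",
                    m.depth, m.depth + 1, s, elo_diff(s));
    }
}
[/code]

If the ELO gain per extra ply keeps shrinking as the base depth grows, that is exactly the diminishing returns mentioned above.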
We could propose all kinds of functions, like E = E.max * (1 - C*exp(-d/d0)*exp(-e/e0)), but as of now I have no clue.
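Just to illustrate the shape of such a candidate function (all constants below are placeholders, nothing has been fitted to real data), this sketch shows that it satisfies assumptions 2) to 4) above (E --> E.max as d or e grows) and that the ELO gain per extra ply shrinks geometrically:

[code]
#include <cmath>
#include <cstdio>

// Candidate rating model: E = E.max * (1 - C*exp(-d/d0)*exp(-e/e0)).
// All constants are hypothetical placeholders.
const double E_max = 3500.0; // assumed rating ceiling
const double C     = 0.5;    // scale of the remaining gap at d = e = 0
const double d0    = 8.0;    // depth decay parameter (in plies)
const double e0    = 4.0;    // evaluation-strength parameter (metric unknown)

double E(double d, double e) {
    return E_max * (1.0 - C * std::exp(-d / d0) * std::exp(-e / e0));
}

int main() {
    double e = 1.0; // fixed evaluation strength
    for (int d = 10; d <= 40; d += 10) {
        // Marginal gain of one extra ply shrinks geometrically:
        // E(d+1,e) - E(d,e) = E_max*C*exp(-e/e0)*exp(-d/d0)*(1 - exp(-1/d0))
        std::printf("d=%2d  E=%7.1f  gain per extra ply=%.2f\n",
                    d, E(d, e), E(d + 1, e) - E(d, e));
    }
}
[/code]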
Also still in the twilight zone: the maximum delta in ELO compared with programs like Argus, Kingsrow and Scan; the depth at which evaluation starts to become non-relevant (because the search itself sees the effects of locks and can calculate the final outcome of good/bad outposts); and to what extent (based upon an as yet unknown metric) we are able to increase the evaluation strength by, for example, a factor of 2...
Bert