Handling search inconsistencies in MTD(f)

Discussion about development of draughts in the time of computer and Internet.
TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Handling search inconsistencies in MTD(f)

Post by TAILLE »

jj wrote:
TAILLE wrote:
jj wrote:So how do you select the best move if the last pass fails low? Or do you re-search until you get a fail high again?
In this scenario after having run rootMT(root, moves, g(k-2),d) I obtained low = g(k-2) and upp = +INF
then after having run rootMT(root, moves, g(k-1), d) I obtained low = g(k-2) and upp = g(k-1);
At this stage the logical continuation is to run the test : rootMT(root, moves, (g(k-1) + g(k-2))/2, d).
You are describing the process of choosing gamma. I'm asking what happens when you completed the last pass (edit: when low >= upp). If the last pass failed low, which move do you return as best move?
Basically you continue running test until low == upp and the best move is a move which assures low value.
However, for my point of view it is not a good idea to wait until low == upp.
In my understanding one of the great differences between alpha-beta and MTD(f) is the following : while alpha-beta seems better for evaluating a position, MTD(f) seems better for finding the best move.
Let's consider a test value gamma. Suppose that by testing the first move m1 you reach a "fail high". That means that the value of m1 is at least equal to gamma. You can of course conclude that the value of the intitial position is at least gamma and as a consequence you can increase the value of gamma in order to find a better value of the initial position; but you can also search for another move in order to look for a second "fail high" with the same test value gamma. If this second "fail high" does not exist then you know move m1 is the best move, even if gamma is far from the best value!
If you find a second "fail high" you know that you have to continue the process by increasing gamma. In order to try to be able to choose quickly between m1 and m2 you will choose the new gamma according to the two return values corresponding on the two "fail high".
As you see you continue until low == upp only if the best two moves have almost the same value. In all other cases you can expect to find the best move more quickly than alpha-beta.
In summary, in my implementation the best move is the move which causes a "fail high" against a test value gamma when all other moves causes a "fail low".
As you see the subtility of the algorithm is the choice of the gamma values. As an example if I expect that the value of the initial position is +150 then I can choose for gamma a value a little smaller, let's say +140, in order to increase the chance to have only one "fail high".
Do I have answer you question?

BTW almost all alpha-beta programmers use a kind of mix of pure alpha-beta and MTD(f) because they use aspiration windows which looks like a MTD(f) procedure doesn't it?
Gérard
jj
Posts: 190
Joined: Sun Sep 13, 2009 23:33
Real name: Jan-Jaap van Horssen
Location: Zeist, Netherlands

Re: Handling search inconsistencies in MTD(f)

Post by jj »

TAILLE wrote:Basically you continue running test until low == upp and the best move is a move which assures low value.
I see, so that part is equivalent to my solution (never use a fail-low value to select a move).

In textbook MTD(f) you can also have low > upp because of search inconsistencies. Does your implementation assure low == upp?
TAILLE wrote:However, for my point of view it is not a good idea to wait until low == upp.
In my understanding one of the great differences between alpha-beta and MTD(f) is the following : while alpha-beta seems better for evaluating a position, MTD(f) seems better for finding the best move.
Let's consider a test value gamma. Suppose that by testing the first move m1 you reach a "fail high". That means that the value of m1 is at least equal to gamma. You can of course conclude that the value of the intitial position is at least gamma and as a consequence you can increase the value of gamma in order to find a better value of the initial position; but you can also search for another move in order to look for a second "fail high" with the same test value gamma. If this second "fail high" does not exist then you know move m1 is the best move, even if gamma is far from the best value!
If you find a second "fail high" you know that you have to continue the process by increasing gamma. In order to try to be able to choose quickly between m1 and m2 you will choose the new gamma according to the two return values corresponding on the two "fail high".
As you see you continue until low == upp only if the best two moves have almost the same value. In all other cases you can expect to find the best move more quickly than alpha-beta.
In summary, in my implementation the best move is the move which causes a "fail high" against a test value gamma when all other moves causes a "fail low".
As you see the subtility of the algorithm is the choice of the gamma values. As an example if I expect that the value of the initial position is +150 then I can choose for gamma a value a little smaller, let's say +140, in order to increase the chance to have only one "fail high".
Do I have answer you question?
Yes, thanks for the explanation. Optimization of MTD(f) and pruning/extensions are next on my list. But first I wanted a correct basic version.
TAILLE wrote:BTW almost all alpha-beta programmers use a kind of mix of pure alpha-beta and MTD(f) because they use aspiration windows which looks like a MTD(f) procedure doesn't it?
Yes.
Post Reply