Basically you continue running test until low == upp and the best move is a move which assures low value.jj wrote:You are describing the process of choosing gamma. I'm asking what happens when you completed the last pass (edit: when low >= upp). If the last pass failed low, which move do you return as best move?TAILLE wrote:In this scenario after having run rootMT(root, moves, g(k-2),d) I obtained low = g(k-2) and upp = +INFjj wrote:So how do you select the best move if the last pass fails low? Or do you re-search until you get a fail high again?
then after having run rootMT(root, moves, g(k-1), d) I obtained low = g(k-2) and upp = g(k-1);
At this stage the logical continuation is to run the test : rootMT(root, moves, (g(k-1) + g(k-2))/2, d).
However, for my point of view it is not a good idea to wait until low == upp.
In my understanding one of the great differences between alpha-beta and MTD(f) is the following : while alpha-beta seems better for evaluating a position, MTD(f) seems better for finding the best move.
Let's consider a test value gamma. Suppose that by testing the first move m1 you reach a "fail high". That means that the value of m1 is at least equal to gamma. You can of course conclude that the value of the intitial position is at least gamma and as a consequence you can increase the value of gamma in order to find a better value of the initial position; but you can also search for another move in order to look for a second "fail high" with the same test value gamma. If this second "fail high" does not exist then you know move m1 is the best move, even if gamma is far from the best value!
If you find a second "fail high" you know that you have to continue the process by increasing gamma. In order to try to be able to choose quickly between m1 and m2 you will choose the new gamma according to the two return values corresponding on the two "fail high".
As you see you continue until low == upp only if the best two moves have almost the same value. In all other cases you can expect to find the best move more quickly than alpha-beta.
In summary, in my implementation the best move is the move which causes a "fail high" against a test value gamma when all other moves causes a "fail low".
As you see the subtility of the algorithm is the choice of the gamma values. As an example if I expect that the value of the initial position is +150 then I can choose for gamma a value a little smaller, let's say +140, in order to increase the chance to have only one "fail high".
Do I have answer you question?
BTW almost all alpha-beta programmers use a kind of mix of pure alpha-beta and MTD(f) because they use aspiration windows which looks like a MTD(f) procedure doesn't it?