MTD-f procedure

TAILLE · Post by **TAILLE** » Sat Oct 10, 2009 03:27

Do you use an iterative search in the MTD-f procedure ?

Let’s take the following example to illustrate my question.

Assume that, in the initial position, you find that, at depth 5, the value of the move 32-28 is +0,020.
Always at depth 5, you will test move 34-30 against the value +0,020 and assume now that you find that 34-30 19-23! is a good answer because you find that the value of 34-30 19-23 gives white a value <= +0,020. As a consequence you will not analyze other black moves.
Suppose that the situation is the same at depth 6, 7, … 15 :
32-28 : value +0,020
34-30 19-23! : value <= +0,020

What will happen at depth 16 if we have
32-28 : value +0,020
34-30 19-23 : value > +0,020
Of course you will try another black move, let’s say 34-30 20-25, but here is the point : your hash table is not prepared to handle efficiently this move, directly at depth 16, because this move is a new move that has never been seen before. In this case do you use an iterative approach in order to know if the value of 34-30 20-25 is <= +0,020 or not ?

Ed Gilbert · Post by **Ed Gilbert** » Sat Oct 10, 2009 14:02

Do you use an iterative search in the MTD-f procedure ?

Gerard, yes I use what is commonly called internal iterative deepening by the chess guys. If at some internal node you don't have a hashtable move to try first, you do a shallow search to get one, hoping that the overhead of this extra search will be more than compensated for by the better move ordering. It has been a long time since I experimented with it but I think it decreased the node counts by around 10%.

Are you still using mtd-f in damy? I thought you were experimenting with a different type of search. Have you tried PVS? This seems to be the choice for most the chess guys, judging from what I read on the forums. I haven't tried it for draughts. My feeling is that it might be better suited for chess than draughts. In chess the average branching factor is much larger (~40?), but typically only a small number of those moves are playable. In draughts I think on average almost half the moves at any position are playable, which makes it more difficult to have a stable pv.

-- Ed

BertTuyt · Post by **BertTuyt** » Sat Oct 10, 2009 14:18

Ed/Gerard, I use PVS, as most Chess/Draughts programs/programmers do. To my knowledge, you are the only ones (within the Draughts community) using mtd-f in your programs.

Any idea to what extend this is better then PVS?
When playing against Kingsrow, I always had the impression that I was outsearched in the late middlegame. But so far I was not able to conclude if this was related to mtd-f, or better move-ordening and/or pruning.
Or granularity of the eval function ??

Also i don't know if there is a difference in scalability when going into parallel between PVS (with YBWC as I do) and mtd-f.

Bert

TAILLE · Post by **TAILLE** » Sat Oct 10, 2009 15:38

Ed Gilbert wrote: Are you still using mtd-f in damy? I thought you were experimenting with a different type of search. Have you tried PVS? This seems to be the choice for most the chess guys, judging from what I read on the forums. I haven't tried it for draughts. My feeling is that it might be better suited for chess than draughts. In chess the average branching factor is much larger (~40?), but typically only a small number of those moves are playable. In draughts I think on average almost half the moves at any position are playable, which makes it more difficult to have a stable pv.
-- Ed

My algorithm is clearly based on mtd-f. I introduced only one major change in order to have better cutoffs : starting from the initial position it might happen that after 32-28 18-23 37-32 the move 12-18 appears as a cutoff. As we all know this is of course a very poor cutoff comparing to 23-29! For that reason I change some "details" of the mtd-f procedure.
I never tried PVS but I also have the feeling that mtd-f is better in draughts (but I am not sure).
For parallel search, in order to be able to use 8 threads, I finally decided to not share the hashtable. As a consequence the hastable access is much simplier and efficient but I cannot use YBWC because this procedure works efficiently only with a share hashtable.
My feeling is that a share hashtable is the best approach with one, two or maybe 4 threads. With more threads private hashtable sounds better.

Ed Gilbert · Post by **Ed Gilbert** » Sat Oct 10, 2009 17:44

Any idea to what extend this is better then PVS?

No idea, and I don't even know that mtd is better, it's pure speculation for me. If you have a good pvs search working then there is probably no reason to change it.

-- Ed

TAILLE · Post by **TAILLE** » Sat Oct 10, 2009 18:59

BertTuyt wrote:Ed/Gerard, I use PVS, as most Chess/Draughts programs/programmers do. To my knowledge, you are the only ones (within the Draughts community) using mtd-f in your programs.

Any idea to what extend this is better then PVS?
When playing against Kingsrow, I always had the impression that I was outsearched in the late middlegame. But so far I was not able to conclude if this was related to mtd-f, or better move-ordening and/or pruning.
Or granularity of the eval function ??

Also i don't know if there is a difference in scalability when going into parallel between PVS (with YBWC as I do) and mtd-f.

Bert

Why I chose MTD-f rather than PVS ?
The three major points I took into account are the following
1) If the position is closed I need to explore the tree as deep as possible in order to reach the point where an exchange will be necessary, and in order to be able to evaluate the consequence of this exchange. The point is the following : in closed positions the number of good moves are very low and it is not easy to discover them
2) In the other hand, in open positions, the number of good moves increases and I do not need to go very deep in the tree to find a good move
3) The PVS algorithm is particularly good if I can guess the good move at all each depth of the tree. If is not the case this algorithm may not be as good as MTD-f algorithm, especially when it appears that the initial position is quite even because in that case the first test-value used by the MTD-f procedure will be quite good.

Now you understand my choice : the positions where we need a lot of calculations are the closed positions. As a consequence I chose the algorithm which is particularly efficient in theses positions. Because the good moves are difficult to find in these positions PVS will be not very efficient, and in addition, because in most practical cases the initial positions are quite even when each player plays the good moves, MTD-f will be very efficient.

The chess game is very different. Most of chess positions are quite similar to our open positions and, in this case PVS may be the good choice.

I know it is not a proof; it is only my feeling.

Gérard

Rein Halbersma · Post by **Rein Halbersma** » Sat Oct 10, 2009 19:10

TAILLE wrote: My algorithm is clearly based on mtd-f. I introduced only one major change in order to have better cutoffs : starting from the initial position it might happen that after 32-28 18-23 37-32 the move 12-18 appears as a cutoff. As we all know this is of course a very poor cutoff comparing to 23-29! For that reason I change some "details" of the mtd-f procedure.

How do you identify whether or not there is a cheaper cutoff?

Rein Halbersma · Post by **Rein Halbersma** » Sat Oct 10, 2009 19:18

TAILLE wrote: Why I chose MTD-f rather than PVS ?
The three major points I took into account are the following
1) If the position is closed I need to explore the tree as deep as possible in order to reach the point where an exchange will be necessary, and in order to be able to evaluate the consequence of this exchange. The point is the following : in closed positions the number of good moves are very low and it is not easy to discover them
2) In the other hand, in open positions, the number of good moves increases and I do not need to go very deep in the tree to find a good move
3) The PVS algorithm is particularly good if I can guess the good move at all each depth of the tree. If is not the case this algorithm may not be as good as MTD-f algorithm, especially when it appears that the initial position is quite even because in that case the first test-value used by the MTD-f procedure will be quite good.

Now you understand my choice : the positions where we need a lot of calculations are the closed positions. As a consequence I chose the algorithm which is particularly efficient in theses positions. Because the good moves are difficult to find in these positions PVS will be not very efficient, and in addition, because in most practical cases the initial positions are quite even when each player plays the good moves, MTD-f will be very efficient.

The chess game is very different. Most of chess positions are quite similar to our open positions and, in this case PVS may be the good choice.

I know it is not a proof; it is only my feeling.

Gérard

If you use PVS with an Aspiration window (i.e. very small window around the value from the previous iteration) for the pv and with null windows for all other moves, there is not much difference with MTD(f), especially if you use fail-soft.

The only difference is when MTD(f) fails at the root it researches with a null-window at the previous fail-soft score, whereas PVS will widen the Aspiration window just enough to contain the fail-soft score. But in principle you could also shift the Aspiration window to wrap around the fail-soft score (rather than just widen it). In that case, it is almost identical to MTD(f).

For me, PVS with all these enhancements sounds more attractive because it's easier to get a good pv from the hash tables. I don't think there will be much difference in the number of nodes searched.

TAILLE · Post by **TAILLE** » Sun Oct 11, 2009 11:13

Rein Halbersma wrote:
TAILLE wrote: Why I chose MTD-f rather than PVS ?
The three major points I took into account are the following
1) If the position is closed I need to explore the tree as deep as possible in order to reach the point where an exchange will be necessary, and in order to be able to evaluate the consequence of this exchange. The point is the following : in closed positions the number of good moves are very low and it is not easy to discover them
2) In the other hand, in open positions, the number of good moves increases and I do not need to go very deep in the tree to find a good move
3) The PVS algorithm is particularly good if I can guess the good move at all each depth of the tree. If is not the case this algorithm may not be as good as MTD-f algorithm, especially when it appears that the initial position is quite even because in that case the first test-value used by the MTD-f procedure will be quite good.

Now you understand my choice : the positions where we need a lot of calculations are the closed positions. As a consequence I chose the algorithm which is particularly efficient in theses positions. Because the good moves are difficult to find in these positions PVS will be not very efficient, and in addition, because in most practical cases the initial positions are quite even when each player plays the good moves, MTD-f will be very efficient.

The chess game is very different. Most of chess positions are quite similar to our open positions and, in this case PVS may be the good choice.

I know it is not a proof; it is only my feeling.

Gérard
If you use PVS with an Aspiration window (i.e. very small window around the value from the previous iteration) for the pv and with null windows for all other moves, there is not much difference with MTD(f), especially if you use fail-soft.

The only difference is when MTD(f) fails at the root it researches with a null-window at the previous fail-soft score, whereas PVS will widen the Aspiration window just enough to contain the fail-soft score. But in principle you could also shift the Aspiration window to wrap around the fail-soft score (rather than just widen it). In that case, it is almost identical to MTD(f).

For me, PVS with all these enhancements sounds more attractive because it's easier to get a good pv from the hash tables. I don't think there will be much difference in the number of nodes searched.

Hi Rein,

Two questions :
1) Why do you think that, when using MTD-f, the pv calculated from the hashtable is not that good ?
2) How you handle my initial question in the PVS context ? (32-28 is part of the PV and evaluated to +0,20 for a long time; 34-30 19-23 is evaluated to a value <= +,0,20 (cutoff) and this also for a lot of depths; and suddenly you discover that 34-30 19-23 is > +0,20. Your hashtable is not prepared to handle this point, is it ?

Rein Halbersma · Post by **Rein Halbersma** » Sun Oct 11, 2009 11:42

TAILLE wrote: Hi Rein,

Two questions :
1) Why do you think that, when using MTD-f, the pv calculated from the hashtable is not that good ?
2) How you handle my initial question in the PVS context ? (32-28 is part of the PV and evaluated to +0,20 for a long time; 34-30 19-23 is evaluated to a value <= +,0,20 (cutoff) and this also for a lot of depths; and suddenly you discover that 34-30 19-23 is > +0,20. Your hashtable is not prepared to handle this point, is it ?

1) because you only have upper bounds or lower bounds you never have exact bounds. So either you have to keep both bounds in the hash table, but then the size of the hash entries quickly becomes more than 64 bits, or you do PVS.
2) as Ed suggested, you can use iterative deepening (and killers, history etc.) at all internal nodes where the hash move fails low.

What about my two previous questions?
a) what information do you store in your ghost bits?
b) how do you identify that after e.g. 32-28 18-32 37-32 moves as 23-29 are better cutoffs then 12-18?

TAILLE · Post by **TAILLE** » Sun Oct 11, 2009 12:34

Hi Rein,

Rein Halbersma wrote: 1) because you only have upper bounds or lower bounds you never have exact bounds. So either you have to keep both bounds in the hash table, but then the size of the hash entries quickly becomes more than 64 bits, or you do PVS.

I use 3x64bits for an entry of my hashtable and I store the upper bounds or the lower bounds.

Rein Halbersma wrote: a) what information do you store in your ghost bits?

I do not use the ghost bits in the move generator. I use it only in the hashtable : I store the best move in the same format (64bits) and here, there are no harm to put other information in order to compress a little the hashtable.

Rein Halbersma wrote: b) how do you identify that after e.g. 32-28 18-32 37-32 moves as 23-29 are better cutoffs then 12-18?

That is an interesting point.
Let's assume that I am testing the value +0,040. When I reach the position 32-28 18-23 37-32, if I am still in the high part of the tree, I run a small search (at depth 3) against a value like +0,040 - 0,500 = -0,460 in order to find a strong cutoff. That way I could find the move 23-29! which will be used in the next iteration as the best move. In order to limit the corresponding overhead I store in the hash table (on 1 bit) that this small search has already been done for this position; this way this small search will not occur for the next iterations.
As you see it is quite easy in the MTD-f context.
It is certainly possible to do quite the same in a PVS context but maybe it is a little more difficullt because you do not see clearly what could be the value of the initial position.