Search Algorithm

Discussion about the development of draughts in the age of computers and the Internet.
Post Reply
Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Search Algorithm

Post by Joost Buijs » Mon Oct 10, 2016 12:18

Fabien Letouzey wrote:
TAILLE wrote: It seems you consider a large egdb (7 or 8p db) not that worthwhile, which is very surprising to me.
TAILLE wrote: My conclusion: I encourage Fabien not to develop the 8p db, otherwise the other programs will have no chance to win a tournament.
Fabien Letouzey wrote: I know you're joking but in all fairness, the current development versions of Kingsrow and Dragon are probably just as strong as Scan, if not better. I know that Kingsrow made enormous progress (was it 62% win?) last year by applying evaluation learning similar to Scan and Michel found several bugs in Dragon, which should be enough by itself. Not to mention that Joost is able to make something of comparable knowledge 10x faster than Scan.

And who knows what Bert and Harm and others will come up with? People seem motivated and for me, that is Scan's best victory that cannot be taken away.
Fabien,

For sure, the good results of Scan are the main reason that I switched back to draughts after such a long time; kudos for that!
It is a pity, though, that there is very little activity in this part of the forum; sometimes you don't see a single post in a week.

That my program runs 10 times faster than Scan is strongly exaggerated; it has more to do with hardware. Scan runs on my computer at ~57 mnps from the initial position, at most. I can double this number for my program by using PGO and large pages for the hash table. I compiled Scan with MSVC; it will run faster when I build it with the Intel compiler and PGO.
The more I add, the slower my program gets, and I wouldn't be surprised if in the end both programs are of comparable speed.

A couple of weeks ago I converted from Windows threads to C++11 threads/locks. This works fine, and I have the impression that it (unexpectedly) runs a tiny bit faster. However, after examining it with the Intel thread debugger I noticed that there are data races in the MSVC C++11 thread library itself, so I decided to switch back.
I use several condition variables with std::unique_lock; I have the impression that this is not very mature in the MSVC library yet.
You are using Linux; I assume that the GCC C++11 thread library doesn't have this problem.
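For reference, here is a minimal, self-contained sketch of the std::condition_variable + std::unique_lock pattern under discussion. The names (WorkQueue, post, drain) are invented for illustration and are not taken from either engine; the point is only that every touch of shared state happens under one mutex, which is exactly what a conforming (race-free) thread library must support:

```cpp
// Minimal sketch (hypothetical names, not from any engine) of the
// condition-variable pattern discussed above: a worker blocks in
// cv.wait() on a std::unique_lock until work is posted, and all shared
// state is only touched while holding the same mutex.
#include <condition_variable>
#include <mutex>
#include <thread>

struct WorkQueue {
    std::mutex mtx;
    std::condition_variable cv;
    int pending = 0;
    bool done = false;

    void post(int n) {
        {
            std::lock_guard<std::mutex> lock(mtx);
            pending += n;
        }
        cv.notify_one();
    }

    void shutdown() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            done = true;
        }
        cv.notify_all();
    }

    // Consume items until shutdown; returns the total number consumed.
    int drain() {
        std::unique_lock<std::mutex> lock(mtx);
        int consumed = 0;
        for (;;) {
            cv.wait(lock, [this] { return pending > 0 || done; });
            consumed += pending;
            pending = 0;
            if (done) return consumed;
        }
    }
};

// Deterministic demo: whatever the thread interleaving, the worker
// consumes exactly the 5 posted items before shutdown completes.
int run_demo() {
    WorkQueue q;
    int total = 0;
    std::thread worker([&] { total = q.drain(); });
    q.post(3);
    q.post(2);
    q.shutdown();
    worker.join();
    return total;  // always 5
}
```

Note the predicate form of cv.wait(): it re-checks the condition after every wakeup, which makes the code immune to spurious wakeups regardless of the library implementation.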

Joost

TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Search Algorithm

Post by TAILLE » Mon Oct 10, 2016 15:41

Joost Buijs wrote:
TAILLE wrote:
Joost Buijs wrote:
Damy must be very smart if it can see the absolute win in this position.
Indeed, after 24-30 Black wins a man in 28 ply, and maybe this is winning; after a 36-ply search with the 8p egdb from Ed enabled, my program still doesn't see more than a one-man advantage.
Hi Joost,

One of the most impressive variant is the following:
1...24-30 2.48-42 30x39 3.33x44 13-19 4.42-37 12-17 5.28x17 21x12 6.32-28 18-23 7.28-22 23-29 8.37-31 29-34 9.22-18 12x23 10.27-21 11.31x22 23-28 12.22x33 34-39 13.44-40 39x28 14.26-21 8-12 15.36-31 25-30 16.31-27 19-24 17.40-35 30-34 18.21-16 28-33 19.27-21 33-39 20.16-11 39-43 21.11-6 43-48 22.21-16 24-29 23.16-11 48-26 24.6-1 26-48 25.1x18 N+ (8p db)
Yes it is impressive, but it takes a long time to prove that there are no hidden refutations.

My program (without egtb) finds the same winning move in 4 seconds with the variation below; of course it only sees that it wins a disc, and I have to add that with the egdb enabled it takes much longer to find this win:

24-30 48-42 30x39 33x44 25-30 42-37 17-22 28x17 21x12 32-28 18-23 28x19 13x24 37-31 24-29 27-21 16x27 31x22 29-33 22-17 12x21 26x17 33-38 17-11 38-43 11-6 43-49

Maybe the last moves in the PV are a bit uncertain, because I distill them from the hash table and don't look at the bounds.
So even without an egdb there is a big chance that a program is able to win this; that Scan didn't see it probably has to do with the very short time controls.

Joost
Joost,

I just do not know what I can answer to your post.
After 24-30 48-42 30x39 33x44 25-30 42-37 17-22 28x17 21x12 32-28 18-23 28x19 13x24 37-31 24-29 27-21 16x27 31x22 29-33 22-17 12x21 26x17 you enter the 8p egdb with a draw result, and Damy proves that after 24-30 48-42 30x39 33x44 25-30? (instead of 13-19!) it is a draw.
We all know that in difficult positions, where at least two moves seem promising (13-19 and 24-30 in my example), we need a deeper analysis in order to discover the winning one. I quite understand that, by working on the evaluation function, you may be able to make the correct choice. For me it sounded simply easier and more reliable to use a large egdb and to improve my search algorithm.
BTW, before this work on the endgame was done, let me tell you that for my first competition I worked for years essentially on building an evaluation function with a standard alpha-beta search. It is just obvious, isn't it?
The problem now is that I feel my evaluation function has become too complex, and I have difficulties improving it.
As a consequence I am working on building a totally new eval function, and I guess I have work for several months.
In addition I have to test my new algorithm for the middle game, and I suspect I will have to tune some parameters according to the phase of the game (endgame or middlegame).


Gérard

Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Search Algorithm

Post by Joost Buijs » Mon Oct 10, 2016 16:24

TAILLE wrote:
Joost Buijs wrote:
TAILLE wrote:
Hi Joost,

One of the most impressive variant is the following:
1...24-30 2.48-42 30x39 3.33x44 13-19 4.42-37 12-17 5.28x17 21x12 6.32-28 18-23 7.28-22 23-29 8.37-31 29-34 9.22-18 12x23 10.27-21 11.31x22 23-28 12.22x33 34-39 13.44-40 39x28 14.26-21 8-12 15.36-31 25-30 16.31-27 19-24 17.40-35 30-34 18.21-16 28-33 19.27-21 33-39 20.16-11 39-43 21.11-6 43-48 22.21-16 24-29 23.16-11 48-26 24.6-1 26-48 25.1x18 N+ (8p db)
Yes it is impressive, but it takes a long time to prove that there are no hidden refutations.

My program (without egtb) finds the same winning move in 4 seconds with the variation below; of course it only sees that it wins a disc, and I have to add that with the egdb enabled it takes much longer to find this win:

24-30 48-42 30x39 33x44 25-30 42-37 17-22 28x17 21x12 32-28 18-23 28x19 13x24 37-31 24-29 27-21 16x27 31x22 29-33 22-17 12x21 26x17 33-38 17-11 38-43 11-6 43-49

Maybe the last moves in the PV are a bit uncertain, because I distill them from the hash table and don't look at the bounds.
So even without an egdb there is a big chance that a program is able to win this; that Scan didn't see it probably has to do with the very short time controls.

Joost
Joost,

I just do not know what I can answer to your post.
After 24-30 48-42 30x39 33x44 25-30 42-37 17-22 28x17 21x12 32-28 18-23 28x19 13x24 37-31 24-29 27-21 16x27 31x22 29-33 22-17 12x21 26x17 you enter the 8p egdb with a draw result, and Damy proves that after 24-30 48-42 30x39 33x44 25-30? (instead of 13-19!) it is a draw.
We all know that in difficult positions, where at least two moves seem promising (13-19 and 24-30 in my example), we need a deeper analysis in order to discover the winning one. I quite understand that, by working on the evaluation function, you may be able to make the correct choice. For me it sounded simply easier and more reliable to use a large egdb and to improve my search algorithm.
BTW, before this work on the endgame was done, let me tell you that for my first competition I worked for years essentially on building an evaluation function with a standard alpha-beta search. It is just obvious, isn't it?
The problem now is that I feel my evaluation function has become too complex, and I have difficulties improving it.
As a consequence I am working on building a totally new eval function, and I guess I have work for several months.
In addition I have to test my new algorithm for the middle game, and I suspect I will have to tune some parameters according to the phase of the game (endgame or middlegame).
Gerard,

Both moves (13-19 and 24-30) win a piece at depth 28; this is all I can tell.
I have no idea how your algorithm works; you show a line of 48 ply that ends up in a won egtb position, but in my opinion this is no real proof.
How can you be certain that this line is forced, that you don't miss a refutation, and that there is not a hidden bug somewhere?
At least somebody else has to verify and confirm your claims before you can speak of proof; this is the way I see it.

Joost

TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Search Algorithm

Post by TAILLE » Mon Oct 10, 2016 17:25

Joost Buijs wrote:
TAILLE wrote:
Joost Buijs wrote:
Yes it is impressive, but it takes a long time to prove that there are no hidden refutations.

My program (without egtb) finds the same winning move in 4 seconds with the variation below; of course it only sees that it wins a disc, and I have to add that with the egdb enabled it takes much longer to find this win:

24-30 48-42 30x39 33x44 25-30 42-37 17-22 28x17 21x12 32-28 18-23 28x19 13x24 37-31 24-29 27-21 16x27 31x22 29-33 22-17 12x21 26x17 33-38 17-11 38-43 11-6 43-49

Maybe the last moves in the PV are a bit uncertain, because I distill them from the hash table and don't look at the bounds.
So even without an egdb there is a big chance that a program is able to win this; that Scan didn't see it probably has to do with the very short time controls.

Joost
Joost,

I just do not know what I can answer to your post.
After 24-30 48-42 30x39 33x44 25-30 42-37 17-22 28x17 21x12 32-28 18-23 28x19 13x24 37-31 24-29 27-21 16x27 31x22 29-33 22-17 12x21 26x17 you enter the 8p egdb with a draw result, and Damy proves that after 24-30 48-42 30x39 33x44 25-30? (instead of 13-19!) it is a draw.
We all know that in difficult positions, where at least two moves seem promising (13-19 and 24-30 in my example), we need a deeper analysis in order to discover the winning one. I quite understand that, by working on the evaluation function, you may be able to make the correct choice. For me it sounded simply easier and more reliable to use a large egdb and to improve my search algorithm.
BTW, before this work on the endgame was done, let me tell you that for my first competition I worked for years essentially on building an evaluation function with a standard alpha-beta search. It is just obvious, isn't it?
The problem now is that I feel my evaluation function has become too complex, and I have difficulties improving it.
As a consequence I am working on building a totally new eval function, and I guess I have work for several months.
In addition I have to test my new algorithm for the middle game, and I suspect I will have to tune some parameters according to the phase of the game (endgame or middlegame).
Gerard,

Both moves (13-19 and 24-30) win a piece at depth 28; this is all I can tell.
I have no idea how your algorithm works; you show a line that ends up in a won egtb position, but in my opinion this is no real proof.
How can you be certain that this line is forced, that you don't miss a refutation, and that there is not a hidden bug somewhere?
At least somebody else has to verify your claims before you can speak of proof; this is the way I see it.

Joost
Joost,

I see your point perfectly. You can never trust the result of a program, except of course if you are very familiar with that program. As soon as the position becomes a little difficult, a human is unable to see the whole subtree built by the program to prove the win. It is exactly the same with the egdb: how can you trust the result given by a db? You can always think that a bug may exist in the program, can't you?
A perfect example is the following position, well known by those who generated the 8p db

Image
White to play

Because Kingsrow, Dragon and Damy all claim a win here, nobody imagines it could not be correct. But if you, as a human, would like to be really convinced, I have to give up immediately, because I know it is completely impossible for a human to understand why it is a sure win.
It is exactly the same with the position analysed here. Because I trust Damy, I am sure 24-30 is a winning move, but I cannot convince a human. If, with the help of other programs, you do not see any error in the sequence I showed, obviously that is not a proof, but it is a good start to trust Damy, isn't it?
BTW, do you always trust Kingsrow when it claims a win?
Gérard

Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Search Algorithm

Post by Rein Halbersma » Mon Oct 10, 2016 19:30

TAILLE wrote: It is rather difficult to give you all the information available, but at least I will do my best.

After each move I write the evaluation and, in parentheses, a value which is representative of the time allotted to the analysis of this move.

After 4" the result is the following
13-19 +50 (14 356)
24-30 +0 (8 104)
17-22 +0 (7 958)
25-30 -100 (151)
8-12 -100 (165)
4-9 -200 (150)
18-23 -440 (149)
etc.

After 8" the result becomes
24-30 +83 (50 074)
13-19 +35 (40 668)
17-22 +0 (36 315)
8-12 -100 (524)
4-9 -240 (498)
etc.

After 20"
24-30 +108 (133 400)
13-19 +35 (55 320)
17-22 +0 (49 240)
8-12 -145 (1 501)
25-30 -200 (1 336)
etc

After 1'
24-30 +150 (504 248)
13-19 +25 (64 198)
17-22 +0 (49 240)
25-30 -260 (4 998)
8-12 -340 (4 992)
etc

The complete win is found after 25'.

As you can see, between 20" and 1' Damy used more than 95% of its time on the 24-30 move. Clearly, for me, this is the secret of the success.

Rein, does it answer your question?
Thanks, yes that's insightful! So basically you have a priority queue at the root, where the number between parentheses is the weighted priority?

Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Search Algorithm

Post by Rein Halbersma » Mon Oct 10, 2016 19:43

Fabien Letouzey wrote: I agree with your view on loops and criteria, but I'm not sure whether it's easier overall. Instead I would say that it's more explicit: what we are maximising is written down somewhere. Indeed that makes experimentation more modular (single place for each concept).

The reason why I think it's not easy is that the formula to pick the next leaf to expand is ideally "global" (maximised over the whole set of leaves). That being said, some best-first search implementations use a greedy search to find a leaf: pick the most promising child at every level. But then, it becomes unclear what is really optimised.

By contrast, when we work on alpha-beta we think at the level of a single node; at least that's what I do. That simplifies reasoning at the price of losing the big picture, which is to improve the "quality" of the root move/score "efficiently" (for some definition of those terms). For example, who has a clear model of what ID + AB + LMR is doing to the tree? I don't, but it looks a lot like best-first search. However the implementation is scattered, like in OOP.

In my study of BFS, standard concepts like "searching an internal node" or "window" disappeared and were replaced by questions like: "which leaf seems the most promising to expand?", "what kind of value(s) do I back up", and sometimes "how confident am I of this node/leaf's score?". For example, bad-looking (locally inferior) moves are given a lower exploration score. This is the "continuous" analogue to alpha-beta which is a clear-cut yes/no.

I think that what you are looking for is doable, but when you actually experiment with BFS, you might not even want to do it the way you propose anymore. In other words, I am suggesting that "searching an internal node" and "window" are artefacts of our thinking in terms of depth-first search. Regardless, the journey (even only reading a couple of articles) should bring some enlightenment.

Fabien.
I agree with all of what you wrote. Did you also study how AlphaGo works? That was quite insightful, how they combine MCTS with cheap as well as expensive evaluation functions, one for internal node expansion and roll-outs, and one for terminal evaluation. I don't know if the tactical nature of draughts would work with it, but it is certainly an interesting approach.

Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Search Algorithm

Post by Rein Halbersma » Mon Oct 10, 2016 19:57

TAILLE wrote: Because Kingsrow, Dragon and Damy all claim a win here, nobody imagines it could not be correct. But if you, as a human, would like to be really convinced, I have to give up immediately, because I know it is completely impossible for a human to understand why it is a sure win.
It is exactly the same with the position analysed here. Because I trust Damy, I am sure 24-30 is a winning move, but I cannot convince a human. If, with the help of other programs, you do not see any error in the sequence I showed, obviously that is not a proof, but it is a good start to trust Damy, isn't it?
BTW, do you always trust Kingsrow when it claims a win?
There is a big difference: endgame databases are very well described in the literature, they are theoretically sound, and also the practical implementation details of building, compressing and using them are well documented. If 3 separate implementations of the endgame database concept come to identical results (in terms of W/L/D counts), then of course we can have full confidence in their correctness. So yes, apart from GHI issues that Kingsrow does not solve, I have full confidence in a reported win by Kingsrow, but not for draw scores.

For your algorithm, things are different. You are the first to report such spectacular time savings. There is no binary application to play with and see the PV for different positions yourself. There is also not a formal description (as in a scientific journal article, or even an informal blog posting) of how exactly the algorithm works, let alone a (pseudo-)code sketch of how it is supposed to work in practice.

I have absolutely no reason to doubt your claims and integrity, so don't take this as an accusation, but I have to agree with Joost. I would need a lot more "evidence" about how your algorithm works (detailed description, pseudo-code etc.) before I can really be confident that Damy's variations are actually correct in the cases that Kingsrow is not able to verify.

TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Search Algorithm

Post by TAILLE » Mon Oct 10, 2016 22:33

Joost Buijs wrote:
Fabien Letouzey wrote:
Joost Buijs wrote:So even without egdb there is a big chance that a program is able to win this, that Scan didn't see this has probably to do with the very short time-controls.
Actually Scan claims that 13-19 also wins a piece (or a huge positional advantage), and for some reason it was easier to prove.
Fabien,

Indeed you are right; 13-19 also gains material. When I exclude 24-30, my program immediately sees 13-19.
There is a big chance that this position is not won after all (despite what Damy thinks); after a 36-ply search + quiescence with 8p egtb probes at the leaves, my program still doesn't see an absolute win.
My search is still without pruning; probably others can look much further ahead. I don't want to spend more time on analyzing this position; maybe somebody else feels the need to do this.

Joost
Joost,

I perfectly understand that you could have doubts concerning the result given by Damy, and that sounds normal! But if you don't try to propose a refutation, how can I defend Damy's analysis?
I made a big step by giving my whole PV (a lot of stuff, isn't it?) and I will be happy to answer all your remaining doubts. Who has other white moves to propose as a potential refutation?
Gérard

TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Search Algorithm

Post by TAILLE » Mon Oct 10, 2016 22:54

Rein Halbersma wrote:
TAILLE wrote: Because Kingsrow, Dragon and Damy all claim a win here, nobody imagines it could not be correct. But if you, as a human, would like to be really convinced, I have to give up immediately, because I know it is completely impossible for a human to understand why it is a sure win.
It is exactly the same with the position analysed here. Because I trust Damy, I am sure 24-30 is a winning move, but I cannot convince a human. If, with the help of other programs, you do not see any error in the sequence I showed, obviously that is not a proof, but it is a good start to trust Damy, isn't it?
BTW, do you always trust Kingsrow when it claims a win?
There is a big difference: endgame databases are very well described in the literature, they are theoretically sound, and also the practical implementation details of building, compressing and using them are well documented. If 3 separate implementations of the endgame database concept come to identical results (in terms of W/L/D counts), then of course we can have full confidence in their correctness. So yes, apart from GHI issues that Kingsrow does not solve, I have full confidence in a reported win by Kingsrow, but not for draw scores.

For your algorithm, things are different. You are the first to report such spectacular time savings. There is no binary application to play with and see the PV for different positions yourself. There is also not a formal description (as in a scientific journal article, or even an informal blog posting) of how exactly the algorithm works, let alone a (pseudo-)code sketch of how it is supposed to work in practice.

I have absolutely no reason to doubt your claims and integrity, so don't take this as an accusation, but I have to agree with Joost. I would need a lot more "evidence" about how your algorithm works (detailed description, pseudo-code etc.) before I can really be confident that Damy's variations are actually correct in the cases that Kingsrow is not able to verify.
OK Rein, I understand your point.
I have put on my TODO list that I have to write an article to explain my developments, but I am afraid this kind of job will take me a lot of time! My current priority is working on a new eval function, which looks like a fun job to me (!?).
In any case I will do my best to answer basic questions.
Gérard

TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Search Algorithm

Post by TAILLE » Mon Oct 10, 2016 22:59

Rein Halbersma wrote:
TAILLE wrote: It is rather difficult to give you all the information available, but at least I will do my best.

After each move I write the evaluation and, in parentheses, a value which is representative of the time allotted to the analysis of this move.

After 4" the result is the following
13-19 +50 (14 356)
24-30 +0 (8 104)
17-22 +0 (7 958)
25-30 -100 (151)
8-12 -100 (165)
4-9 -200 (150)
18-23 -440 (149)
etc.

After 8" the result becomes
24-30 +83 (50 074)
13-19 +35 (40 668)
17-22 +0 (36 315)
8-12 -100 (524)
4-9 -240 (498)
etc.

After 20"
24-30 +108 (133 400)
13-19 +35 (55 320)
17-22 +0 (49 240)
8-12 -145 (1 501)
25-30 -200 (1 336)
etc

After 1'
24-30 +150 (504 248)
13-19 +25 (64 198)
17-22 +0 (49 240)
25-30 -260 (4 998)
8-12 -340 (4 992)
etc

The complete win is found after 25'.

As you can see, between 20" and 1' Damy used more than 95% of its time on the 24-30 move. Clearly, for me, this is the secret of the success.

Rein, does it answer your question?
Thanks, yes that's insightful! So basically you have a priority queue at the root, where the number between parentheses is the weighted priority?
No Rein, the number in parentheses cannot be seen as a priority. It is only a representation of the time used on this move during all the previous iterations.
I included it so that you can see the repartition of Damy's effort over the different moves.
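As an illustration only (this is not Damy's actual algorithm, which has not been published), here is one way a root loop can produce this kind of effort repartition: repeatedly pick the root move that currently looks most promising, spend a slice of effort on it, and keep a per-move effort counter like the number in parentheses above. RootMove, select_most_promising and the effort units are all invented names:

```cpp
// Hypothetical sketch of effort allocation at the root of a best-first
// style search. NOT Damy's actual algorithm; it only illustrates how
// nearly all effort can end up on the move that currently scores best.
#include <cstddef>
#include <string>
#include <vector>

struct RootMove {
    std::string move;
    int score;    // current evaluation of this root move
    long effort;  // accumulated effort spent on this move so far
};

// Select the move to search next: the one with the highest current score.
std::size_t select_most_promising(const std::vector<RootMove>& moves) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < moves.size(); ++i)
        if (moves[i].score > moves[best].score) best = i;
    return best;
}

// One step of the loop: charge `slice` units of effort to the chosen move.
// A real engine would re-search the move here and update its score.
void search_step(std::vector<RootMove>& moves, long slice) {
    moves[select_most_promising(moves)].effort += slice;
}
```

With stable scores, every slice lands on the top move, which matches the ">95% of the time on 24-30" pattern in the numbers above.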
Gérard

Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Search Algorithm

Post by Joost Buijs » Tue Oct 11, 2016 07:24

TAILLE wrote:
Joost,

I see your point perfectly. You can never trust the result of a program, except of course if you are very familiar with that program. As soon as the position becomes a little difficult, a human is unable to see the whole subtree built by the program to prove the win. It is exactly the same with the egdb: how can you trust the result given by a db? You can always think that a bug may exist in the program, can't you?
A perfect example is the following position, well known by those who generated the 8p db

Image
White to play

Because Kingsrow, Dragon and Damy all claim a win here, nobody imagines it could not be correct. But if you, as a human, would like to be really convinced, I have to give up immediately, because I know it is completely impossible for a human to understand why it is a sure win.
It is exactly the same with the position analysed here. Because I trust Damy, I am sure 24-30 is a winning move, but I cannot convince a human. If, with the help of other programs, you do not see any error in the sequence I showed, obviously that is not a proof, but it is a good start to trust Damy, isn't it?
BTW, do you always trust Kingsrow when it claims a win?
Gerard,

It is indeed virtually impossible for humans to keep track of all the variations and to fully understand what is happening on the board.

When a (very selective) program shows me a variation 48 ply deep and tells me that this variation is a sure win, I don't believe that without additional proof. Like I said, I have no idea how your algorithm works; maybe you are back-tracking or using an algorithm that resembles what IDeA does for chess.

I never looked at the programs you mention above, I only know them by name. No matter what, I never trust a program 100%, not even my own.

Joost

Fabien Letouzey
Posts: 299
Joined: Tue Jul 07, 2015 07:48
Real name: Fabien Letouzey

Re: Search Algorithm

Post by Fabien Letouzey » Tue Oct 11, 2016 07:32

Rein Halbersma wrote:I agree with all of what you wrote. Did you also study how AlphaGo works? That was quite insightful, how they combine MCTS with cheap as well as expensive evaluation functions, one for internal node expansion and roll-outs, and one for terminal evaluation. I don't know if the tactical nature of draughts would work with it, but it is certainly an interesting approach.
I'm a Go player and followed the big match. I only had a quick look at AlphaGo (AG), but I worked with Rémi Coulom on policy networks.

I'm not sure what you mean by cheap evaluation; I only know of a single evaluation in AG, the value network, and it's very heavy. Let's assume that you are talking about the fast heuristics used in the simulations. Yes, there has been a gap between the tree stored in memory and the simulations since the beginning of MCTS, so it's not specific to AG. The alpha-beta equivalent of this separation is a heuristic that is applied only when depth >= n to save NPS. In Go, one uses fast tricks in the simulations, but can afford to compute things more seriously inside the tree. The connection between the two is that a node is expanded (in the sense of best-first search) when it has been visited, say, 20 times.
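The expansion rule just described (simulate at a leaf, and only turn it into an in-tree node after a visit threshold) can be sketched as follows. The node layout and the names are illustrative, not taken from AG or any particular program:

```cpp
// Sketch of visit-threshold expansion in an MCTS-style tree: a leaf
// accumulates simulation results, and is only expanded into children
// (the more expensive in-tree work) after enough visits.
#include <memory>
#include <vector>

struct Node {
    int visits = 0;
    double wins = 0.0;
    bool expanded = false;
    std::vector<std::unique_ptr<Node>> children;
};

constexpr int kExpandThreshold = 20;  // "visited say 20 times"

// Record one simulation result at this node; expand it once it has been
// visited often enough to justify creating child nodes.
void update(Node& n, double result, int num_moves) {
    n.visits += 1;
    n.wins += result;
    if (!n.expanded && n.visits >= kExpandThreshold) {
        for (int i = 0; i < num_moves; ++i)
            n.children.push_back(std::make_unique<Node>());
        n.expanded = true;
    }
}
```

The threshold is the knob that trades cheap simulation statistics against the heavier per-node computation done inside the tree.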

About the policy network, which is also not an AG novelty. I prefer to call it "move evaluation" as opposed to the "position evaluation" (AG's value network). It's a neural network that scores moves in a position. It's relatively easy to do in Go (compared to other games) because moves involve a single "square", so the "move map" (array of scores) is organised as an image/matrix. Go programmers now use a policy network to bias exploration in the tree, and it gains hundreds of Elo points even compared to the previous state of the art: advanced patterns that were difficult to learn well. BTW it's also heavy; all neural networks are.

Now regarding potential application of this to other games, I think it has some. It could be used for instance to decide move reductions. We don't need to use neural networks, the (now) usual patterns should do just fine. We can restrict their use away from the horizon to limit the overhead.

I have a good feeling about the dual nature of move/position evaluation; there might be something there. In alpha-beta games maybe we are focusing too much on positions, and we should at least start thinking also in terms of move evaluation (or scoring). Human players (and LMR + history) are doing that, after all.
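A minimal sketch of this idea: use an a-priori move score (history counters, patterns, or a policy network output) to modulate LMR-style reductions, instead of reducing purely by move index. The thresholds and the [0, 1000] scale are invented for illustration and are not taken from any engine:

```cpp
// Sketch: move-evaluation-driven reductions. A higher move_score means
// the move looks better a priori; bad-looking late moves are reduced
// more, promising ones less. All constants are illustrative.
#include <algorithm>

int lmr_reduction(int depth, int move_index, int move_score) {
    if (depth < 3 || move_index < 2) return 0;   // never reduce early moves
    int r = 1;
    if (move_index >= 6) r += 1;                 // classic "late move" term
    if (move_score < 200) r += 1;                // bad-looking move: reduce more
    if (move_score > 800) r -= 1;                // promising move: don't reduce
    return std::max(0, std::min(r, depth - 1));  // keep at least 1 ply
}
```

This is the "continuous" flavour mentioned above: instead of a yes/no prune, the move score shades how much depth a move receives.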

Back to AG. The main innovation is the value network, the first good eval in Go. It gains even more hundreds of Elo points which is amazing, and will no doubt be the focus of Go programmers for at least a few years. There is a catch, though: that eval is super slow and on top of that, GPUs want to compute in batches. So here's my quick view of AG: the whole design is centered on "what to do while waiting for the GPU?", and parallelism + simulations seem to be a good way to fill that need.

About MCTS. While MCTS was historically used in Go because of the (unique) absence of eval, it still seems to have an edge for games with high branching factor like Amazons. The sampling aspect of MC is probably the key part. Nonetheless I feel that MCTS was overhyped and I haven't heard a lot of success stories.

We are now witnessing an evolution of MCTS to incorporate eval. For example in Amazons they play a few moves in simulations then apply eval, instead of playing to the end. An interesting case for us is Breakthrough (https://en.m.wikipedia.org/wiki/Breakth ... oard_game)), that I find close to draughts in spirit. In that game, alpha-beta+patterns and MCTS both lead to top programs, which I find interesting.

Regarding tactics, you have a point. I don't know about MCTS, but best-first search can have problems with it: if a move is scored -1.xx it will be mostly ignored afterwards. In other words, BFS is highly selective: it searches both very short and very long lines. Of course, there is a parameter to tune that.

Fabien.

Fabien Letouzey
Posts: 299
Joined: Tue Jul 07, 2015 07:48
Real name: Fabien Letouzey

Re: Search Algorithm

Post by Fabien Letouzey » Tue Oct 11, 2016 07:43

Joost,
Joost Buijs wrote:When a (very selective) program shows me a variation of 48 ply deep and tells me that this variation is a sure win I don't believe that without any additional proof. Like said I have no idea how your algorithm works, maybe you are back-tracking or using an algorithm that resembles what IDeA for chess does.
IDeA is just best-first search (guess but I'm very confident)
DOE is just best-first search (this is for Bert, sorry to hijack your post for this)
Gérard's "own new algorithm" is just best-first search (guess; slightly less confident than for IDeA)
I sense a pattern here.

If you study BFS, all will become clear. That is more a "thought experiment", though; it's not going to make your current project stronger. I'm saying that because it's a very different world from depth-first search, so there is a learning curve.
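As a starting point, the greedy leaf selection mentioned earlier in the thread ("pick the most promising child at every level") can be sketched as follows, in negamax convention; the node structure is illustrative only, and a real BFS would add an exploration term rather than descend purely greedily:

```cpp
// Sketch of greedy leaf selection in a best-first search tree: from the
// root, repeatedly descend into the child that looks best for the side
// to move, until a leaf (no children) is reached.
#include <vector>

struct BfsNode {
    double score = 0.0;  // value from the side to move at this node
    std::vector<BfsNode*> children;
};

// Negamax convention: a child's score is negated from the parent's view.
BfsNode* select_leaf(BfsNode* n) {
    while (!n->children.empty()) {
        BfsNode* best = n->children[0];
        for (BfsNode* c : n->children)
            if (-c->score > -best->score) best = c;
        n = best;  // greedy: most promising child at every level
    }
    return n;
}
```

The selected leaf is then expanded and its value backed up the path, which is where BFS variants differ (what value to back up, and how confidence is tracked).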

Fabien.

Joost Buijs
Posts: 471
Joined: Wed May 04, 2016 11:45
Real name: Joost Buijs

Re: Search Algorithm

Post by Joost Buijs » Tue Oct 11, 2016 10:32

Fabien Letouzey wrote:Joost,
Joost Buijs wrote:When a (very selective) program shows me a variation of 48 ply deep and tells me that this variation is a sure win I don't believe that without any additional proof. Like said I have no idea how your algorithm works, maybe you are back-tracking or using an algorithm that resembles what IDeA for chess does.
IDeA is just best-first search (guess but I'm very confident)
DOE is just best-first search (this is for Bert, sorry to hijack your post for this)
Gérard's "own new algorithm" is just best-first search (guess; slightly less confident than for IDeA)
I sense a pattern here.

If you study BFS, all will become clear. That is more a "thought experiment", though; it's not going to make your current project stronger. I'm saying that because it's a very different world from depth-first search, so there is a learning curve.

Fabien.
Fabien,

It is not my intention to start with anything like BFS; although I never studied it in detail, I'm slightly aware of what it can do.
For draughts and chess I will keep using good old DFS, which seems to me more than appropriate for the task it has to handle.
I'm sure that BFS has its uses too; using it to build an opening library (like IDeA does for chess) comes to mind.

Joost

Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Search Algorithm

Post by Rein Halbersma » Tue Oct 11, 2016 12:57

Fabien Letouzey wrote: We are now witnessing an evolution of MCTS to incorporate eval. For example in Amazons they play a few moves in simulations then apply eval, instead of playing to the end. An interesting case for us is Breakthrough (https://en.m.wikipedia.org/wiki/Breakth ... oard_game)), that I find close to draughts in spirit. In that game, alpha-beta+patterns and MCTS both lead to top programs, which I find interesting.
Thanks for your elaborate reply and for sharing your philosophy.

FYI, here's a link to the 2nd edition (work-in-progress) of the Reinforcement Learning book by Sutton & Barto. Apart from the general framework, there are 3 particularly interesting sections: 16.2 on Samuel's checkers program (which started the whole RL business), 16.9 on AlphaGo, and 17.1 with a nice picture showing how TD-learning, Dynamic Programming, Monte Carlo and exhaustive tree search are all special limits of a general framework.

EDIT: forgot the link, here it is: http://incompleteideas.net/sutton/book/ ... 016sep.pdf
Last edited by Rein Halbersma on Tue Oct 11, 2016 15:21, edited 1 time in total.

Post Reply