Search Algorithm

Discussion about the development of draughts in the era of computers and the Internet.
Walter Thoen
Posts: 44
Joined: Wed Nov 17, 2010 13:26
Real name: Walter Thoen

Re: Search Algorithm

Post by Walter Thoen » Wed Mar 13, 2013 10:36

Rein Halbersma wrote: Actually, I have a strong feeling that parallel search *will* scale to 50+ cores, but it requires advanced programming environments such as the one used by Cilkchess. One problem that Cilk (currently available as a special branch of gcc maintained by Intel) solves compared to most existing YBW implementations (such as the one in Stockfish) is the split-point handling and load-balancing over the various cores. The tree itself contains enough parallelism as search depth increases; it's "only" a matter of keeping all the processors busy. The Cilk scheduler does so in a provably optimal way (by randomized work-stealing rather than the deterministic work-pushing of almost every hand-made implementation of YBW).
There are some technical obstacles because a Xeon Phi is a co-processor. On the cilkplus.org website there is the following warning about this:

However, you need to be aware that Cilk Plus assumes that your application is executing in a single, unified address space. Tasks that are offloaded to the Xeon Phi coprocessor are executing in a different address space than the application running on the host processor, and the thread pools on the host processor and the Xeon Phi coprocessor are totally separate. Work cannot be stolen between the host processor and the Xeon Phi coprocessor.

The reason I am not convinced that parallel search will perform well on 50+ cores is all the test results that Bert has published so far in this thread. No doubt with 50+ cores more nodes per second can be achieved. To make good use of the extra nodes you might search deeper, or you might prune less to gain Elo. Or you do not increase the number of nodes, but instead spend more evaluation time per node. Finding a single (hardcoded) optimal configuration for the search and evaluation seems a daunting task. The optimal configuration might also depend on game type and game phase. Hence my interest in the idea of using multiple configurations in parallel and then arbitrating (no idea how to do this cleverly :D ).
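For reference, the spawn/steal style that Rein describes looks roughly like the sketch below. This is only an illustration under assumptions, not code from any of the engines discussed here: Position, make_move(), leaf_eval() and the move count are placeholders, the alpha update and the abort of siblings after a late cut-off are deliberately left out, and a Cilk Plus enabled compiler is assumed.

Code: Select all

#include <cilk/cilk.h>

typedef struct { int dummy; } Position;                 /* placeholder position type */

static Position make_move(const Position *p, int m) { (void)m; return *p; }  /* placeholder */
static int leaf_eval(const Position *p)             { (void)p; return 0;  }  /* placeholder */

static int search(const Position *pos, int depth, int alpha, int beta)
{
	if (depth == 0)
		return leaf_eval(pos);

	int nmoves = 10;                         /* placeholder move count */
	Position child[64];
	int score[64];

	/* Young Brothers Wait: the eldest move is searched serially, so that a
	   likely cut-off is found before any parallel work is created. */
	child[0] = make_move(pos, 0);
	int best = -search(&child[0], depth - 1, -beta, -alpha);
	if (best >= beta)
		return best;

	for (int m = 1; m < nmoves; ++m) {
		child[m] = make_move(pos, m);
		/* The younger brothers become stealable tasks; idle workers take them
		   via the scheduler's randomized work-stealing, which replaces the
		   hand-made split-point and load-balancing code of typical YBW
		   implementations. */
		score[m] = cilk_spawn search(&child[m], depth - 1, -beta, -alpha);
	}
	cilk_sync;                               /* wait for all spawned searches */

	for (int m = 1; m < nmoves; ++m)
		if (-score[m] > best)
			best = -score[m];
	return best;
}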

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Search Algorithm

Post by BertTuyt » Wed Mar 13, 2013 23:14

I'm afraid my crystal ball is also not perfect.
But anyway, I believe that to utilize the zillions of cores in the future we might do something different from plain parallel alpha-beta (or MTD(f) or ...).

I have seen a nice slide where the hardware revolution was depicted in 3 phases:
- The GHz period: where programs just benefit from faster clocks (when I started with draughts this was a < 1 MHz PDP-11, and now around 3 GHz).
- The multi-core period: here we live nowadays, and we think that the algorithms we use today (YBWC or the like) are sufficient to fuel the 4-16 cores of the next years.
- The future many-core period: core counts exploding beyond 100, 1000, ...... Here the math will change, and maybe MCTS will also work for draughts in the end :) .

Bert

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Search Algorithm

Post by BertTuyt » Mon Mar 18, 2013 23:09

I'm now working on increasing the horizon nodes/second through the concept of lazy evaluation.
The current implementation looks as follows:

Code: Select all

	if ( ( iTempScore >= iBeta + ( HORIZON_SOFTEVAL_VALUE ) ) ) {
		pTDepth->bEval = false ;
		bLazyEvaluation = true ;

		return ( iTempScore ) ;

	} else {
		pTDepth->bEval = true;
		bLazyEvaluation = false ;
	}
The iTempScore is based only on material and free_path (a possible promotion).

Do other programmers also make use of this "lazy" evaluation?
I also want to add the case where the score is "much" lower than the iAlfa bound, but for whatever reason this hasn't worked out well so far...

Bert

MichelG
Posts: 244
Joined: Sun Dec 28, 2003 20:24
Contact:

Re: Search Algorithm

Post by MichelG » Tue Mar 19, 2013 12:46

BertTuyt wrote: I'm afraid my crystal ball is also not perfect.
But anyway, I believe that to utilize the zillions of cores in the future we might do something different from plain parallel alpha-beta (or MTD(f) or ...).

I have seen a nice slide where the hardware revolution was depicted in 3 phases:
- The GHz period: where programs just benefit from faster clocks (when I started with draughts this was a < 1 MHz PDP-11, and now around 3 GHz).
- The multi-core period: here we live nowadays, and we think that the algorithms we use today (YBWC or the like) are sufficient to fuel the 4-16 cores of the next years.
- The future many-core period: core counts exploding beyond 100, 1000, ...... Here the math will change, and maybe MCTS will also work for draughts in the end :) .

Bert
I am not so sure that processors will go to thousands of cores. Programming many-core CPUs is very hard for all programs, not just draughts. I expect there will be a combination of fast few-core CPUs and slower but many-core CPUs for the foreseeable future. For instance, you might have a single 4 GHz complex core, with features such as out-of-order execution and branch prediction, combined with 4000 simple cores that run at 1 GHz.

There also seems to be a trend towards lower (electrical) power CPUs. Just look at the many tablet/smartphone CPUs that are created nowadays. Within a short time, many traditional desktops have been or will be replaced by tablets and smartphones.

And Intel has made it clear that it will lower the power requirements of desktop PCs. And lower electrical power means less computing power.

One interesting development would be tera/petabyte-sized memory. Then you could pre-compute an enormous hashtable of frequently occurring positions.
BertTuyt wrote: I'm now working on increasing the horizon nodes/second through the concept of lazy evaluation.
...
The iTempScore is based only on material and free_path (a possible promotion).

Do other programmers also make use of this "lazy" evaluation?
I also want to add the case where the score is "much" lower than the iAlfa bound, but for whatever reason this hasn't worked out well so far...

Bert
Dragon uses lazy eval for both the upper and lower bounds (with a margin of about 1.5 pawns). It speeds up the code by a fair amount.
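In outline, lazy evaluation with a margin on both bounds might look like the sketch below. This is illustrative only, not the actual Dragon or Damage code; Position, material_and_path(), full_eval() and LAZY_MARGIN are assumed names, and the units are assumed to be roughly 100 per man.

Code: Select all

#include <stdbool.h>

typedef struct Position Position;              /* the engine's position type        */
int material_and_path(const Position *pos);    /* cheap part: material + free path  */
int full_eval(const Position *pos);            /* full positional evaluation        */

#define LAZY_MARGIN 150                        /* ~1.5 pawns, assuming 100 per man  */

int evaluate_horizon(const Position *pos, int alpha, int beta, bool *bLazy)
{
	int iFast = material_and_path(pos);

	/* If the cheap estimate is already far above beta or far below alpha,
	   the remaining positional terms cannot pull the score back inside the
	   window, so skip the expensive evaluation and return the estimate. */
	if (iFast >= beta + LAZY_MARGIN || iFast <= alpha - LAZY_MARGIN) {
		*bLazy = true;
		return iFast;
	}

	*bLazy = false;
	return full_eval(pos);
}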

Michel

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Search Algorithm

Post by BertTuyt » Sat Apr 06, 2013 17:43

In the meantime I have been working on an automatic opening book generator based on Drop Out Expansion (DOE-ish :) ).
Within this context I very much appreciate the support of Martin Fierz, who shared all his source code.
In the end I didn't use his code, but it at least provided some clues and hints which I had not thought about (yet).

I have almost finished all the main routines (although the devil is in the details).
So in the next week(s) I can start generating a moderate book, consisting of 10K-20K positions.
I think this will take 1-2 weeks.

Every book entry is based on the structure below:

Code: Select all

typedef struct {
	BITBOARD               bbKing ;
	BITBOARD               bbWhitePiece ;
	BITBOARD               bbBlackPiece ;
	short int              iScore ;
	short int	           iDepth ;
	unsigned short int     uFlags ;
	unsigned short int	  uReserved ;
} BOOKNODE, *PBOOKNODE ;
So every entry requires 256 bits (32 bytes); uReserved has no "function", but I wanted a "round" number.
So far I do not sort the nodes on the 192 bits of bbKing-bbWhitePiece-bbBlackPiece, but I might need to do that if the book size increases.
With a sorted list, finding a book position is a relatively fast binary search.
Although there might be more efficient book structures, I wanted to avoid a hash-based book because of collisions.
And as the final book will most likely not contain dramatically more than 1M nodes, 32 MB for an in-memory book is doable :), especially compared with memory-hungry tables like endgame databases.
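As an illustration of that lookup, sorting and probing such an array with the BOOKNODE structure above could look roughly as follows (a sketch only, assuming BITBOARD is a 64-bit unsigned integer; the ordering of the three bitboards is arbitrary, as long as the sort and the search use the same comparison):

Code: Select all

#include <stdlib.h>

/* Compare two book nodes on the 192-bit key bbKing / bbWhitePiece / bbBlackPiece. */
static int CompareBookNode(const void *pA, const void *pB)
{
	const BOOKNODE *a = pA, *b = pB;
	if (a->bbKing       != b->bbKing)       return (a->bbKing       < b->bbKing)       ? -1 : 1;
	if (a->bbWhitePiece != b->bbWhitePiece) return (a->bbWhitePiece < b->bbWhitePiece) ? -1 : 1;
	if (a->bbBlackPiece != b->bbBlackPiece) return (a->bbBlackPiece < b->bbBlackPiece) ? -1 : 1;
	return 0;
}

/* Sort once after loading or generating the book ... */
void SortBook(BOOKNODE *pBook, size_t nEntries)
{
	qsort(pBook, nEntries, sizeof(BOOKNODE), CompareBookNode);
}

/* ... then every probe is an O(log n) binary search; NULL means "not in book". */
const BOOKNODE *ProbeBook(const BOOKNODE *pBook, size_t nEntries,
                          BITBOARD bbKing, BITBOARD bbWhite, BITBOARD bbBlack)
{
	BOOKNODE key;
	key.bbKing = bbKing;
	key.bbWhitePiece = bbWhite;
	key.bbBlackPiece = bbBlack;
	return bsearch(&key, pBook, nEntries, sizeof(BOOKNODE), CompareBookNode);
}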

The table structure also provides the option to include "zillions" of different positions from the middle game.
I added the option to expand the book for specific positions which might be interesting (supported by several new proprietary Guide commands, !book64 ..... ).
Also, as the book is checked every move, it is possible that after an initial drop-out of book the game comes back into book.

I would appreciate comments/suggestions from other programmers, related to book construction and/or whatever.
If all works, I will share the code and the book again (as usual) with all involved.

Bert

Ed Gilbert
Posts: 859
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Re: Search Algorithm

Post by Ed Gilbert » Sat Apr 06, 2013 20:32

So every entry requires 256 bits (32 bytes); uReserved has no "function", but I wanted a "round" number.
So far I do not sort the nodes on the 192 bits of bbKing-bbWhitePiece-bbBlackPiece, but I might need to do that if the book size increases.
With a sorted list, finding a book position is a relatively fast binary search.
Although there might be more efficient book structures, I wanted to avoid a hash-based book because of collisions.
And as the final book will most likely not contain dramatically more than 1M nodes, 32 MB for an in-memory book is doable, especially compared with memory-hungry tables like endgame databases.
The kingsrow opening book is 2.5 million positions and occupies 53 MB of space. Each position uses 20 bytes, but the book file also has a header and a table of unique king bitboards, so it is not exactly 20*N bytes on disk. 16 bytes are consumed by the bitboards of black and white pieces. The other 4 bytes are used differently depending on whether the position contains a king or not.

Since very few opening book positions contain kings, and those that do often share a common king bitboard pattern with other book positions, a small table of unique king bitboards is part of the opening book file. In positions that have kings, the remaining 4 of the 20 bytes contain an index into the table of king bitboards, as well as the search score for the position, the color of the side to move, and the best move represented as a 7-bit value between 0 and 127. Most book positions (those without kings) use the remaining 4 bytes for the search score, the color of the side to move, and up to 2 best moves, each a 7-bit value.

There is no depth information in the book. The book depth numbers that are displayed by kingsrow when a book move is played are found dynamically by a tree walk. This is also true of the displayed book scores. The scores stored in the book nodes are only used at leaf nodes; at interior nodes the scores are determined by minimax during the tree walk.

The book nodes are stored in a hashtable. If you make the hashtable large enough to hold about 10% more nodes than you need to store, then collisions are not a problem. On average you only need to probe a very small number of nodes to either find the position or determine that the position is not present in the book.
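To illustrate how cheap such a probe is, a generic linear-probing lookup is sketched below. This is not the kingsrow code; the slot layout, field names and hash mixing constants are assumptions, and only the two piece bitboards are used as the key.

Code: Select all

#include <stddef.h>
#include <stdint.h>

typedef struct {
	uint64_t bbWhite, bbBlack;   /* piece bitboards identify the position             */
	uint32_t uPacked;            /* score, side to move, best move(s), king index ... */
	uint8_t  bUsed;              /* slot occupied?                                     */
} BOOKSLOT;

/* Placeholder mixing function; any reasonable 64-bit hash will do here. */
static size_t BookHash(uint64_t bbWhite, uint64_t bbBlack, size_t nSlots)
{
	uint64_t h = bbWhite * 0x9E3779B97F4A7C15ull ^ bbBlack * 0xC2B2AE3D27D4EB4Full;
	return (size_t)(h % nSlots);
}

/* Linear probing: with roughly 10% of the slots kept free, the average probe
   sequence is only a handful of entries long. */
const BOOKSLOT *BookLookup(const BOOKSLOT *pTable, size_t nSlots,
                           uint64_t bbWhite, uint64_t bbBlack)
{
	size_t i = BookHash(bbWhite, bbBlack, nSlots);
	while (pTable[i].bUsed) {
		if (pTable[i].bbWhite == bbWhite && pTable[i].bbBlack == bbBlack)
			return &pTable[i];              /* position found                     */
		i = (i + 1) % nSlots;               /* try the next slot                  */
	}
	return NULL;                            /* hit an empty slot: not in the book */
}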

-- Ed

MichelG
Posts: 244
Joined: Sun Dec 28, 2003 20:24
Contact:

Re: Search Algorithm

Post by MichelG » Wed Apr 17, 2013 12:52

BertTuyt wrote: In the meantime I have been working on an automatic opening book generator based on Drop Out Expansion (DOE-ish :) ).
Within this context I very much appreciate the support of Martin Fierz, who shared all his source code.
In the end I didn't use his code, but it at least provided some clues and hints which I had not thought about (yet).

I have almost finished all the main routines (although the devil is in the details).
So in the next week(s) I can start generating a moderate book, consisting of 10K-20K positions.
I think this will take 1-2 weeks.

Every book entry is based on the structure below:

Code: Select all

typedef struct {
	BITBOARD               bbKing ;
	BITBOARD               bbWhitePiece ;
	BITBOARD               bbBlackPiece ;
	short int              iScore ;
	short int	           iDepth ;
	unsigned short int     uFlags ;
	unsigned short int	  uReserved ;
} BOOKNODE, *PBOOKNODE ;
So every entry requires 256 bits (32 bytes); uReserved has no "function", but I wanted a "round" number.
So far I do not sort the nodes on the 192 bits of bbKing-bbWhitePiece-bbBlackPiece, but I might need to do that if the book size increases.
With a sorted list, finding a book position is a relatively fast binary search.
Although there might be more efficient book structures, I wanted to avoid a hash-based book because of collisions.
And as the final book will most likely not contain dramatically more than 1M nodes, 32 MB for an in-memory book is doable :), especially compared with memory-hungry tables like endgame databases.

The table structure also provides the option to include "zillions" of different positions from the middle game.
I added the option to expand the book for specific positions which might be interesting (supported by several new proprietary Guide commands, !book64 ..... ).
Also, as the book is checked every move, it is possible that after an initial drop-out of book the game comes back into book.

I would appreciate comments/suggestions from other programmers, related to book construction and/or whatever.
If all works, I will share the code and the book again (as usual) with all involved.

Bert
This seems like an interesting idea; Dragon's book is not that big, and I have started working on DOE expansion myself as well.

I use a dynamically sized hash table for storing the book; that seems to work well. I also store some additional data in the book, like search depth and the full position. The book won't grow much beyond a few million positions, so storage should not be a problem.

One question is still open to me: to what depth do you search the leaf-nodes?

You could search each leaf node for, let's say, 10 seconds, but if the game level is set higher, then the book is not good enough.

Also, if you train the book at 10 seconds/move and hardware becomes much faster in the next few years, the book might get outdated.

And if the time per move is too high, you get fewer positions in the book. What did you choose as the optimum?

Do you plan on a client/server architecture, so that you have many clients running single-threaded for increased efficiency?

Michel

TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Re: Search Algorithm

Post by TAILLE » Thu Apr 18, 2013 22:17

MichelG wrote:
BertTuyt wrote: In the meantime I have been working on an automatic opening book generator based on Drop Out Expansion (DOE-ish :) ).
Within this context I very much appreciate the support of Martin Fierz, who shared all his source code.
In the end I didn't use his code, but it at least provided some clues and hints which I had not thought about (yet).

I have almost finished all the main routines (although the devil is in the details).
So in the next week(s) I can start generating a moderate book, consisting of 10K-20K positions.
I think this will take 1-2 weeks.

Every book entry is based on the structure below:

Code: Select all

typedef struct {
	BITBOARD               bbKing ;
	BITBOARD               bbWhitePiece ;
	BITBOARD               bbBlackPiece ;
	short int              iScore ;
	short int	           iDepth ;
	unsigned short int     uFlags ;
	unsigned short int	  uReserved ;
} BOOKNODE, *PBOOKNODE ;
So every entry requires 256 bits (32 bytes); uReserved has no "function", but I wanted a "round" number.
So far I do not sort the nodes on the 192 bits of bbKing-bbWhitePiece-bbBlackPiece, but I might need to do that if the book size increases.
With a sorted list, finding a book position is a relatively fast binary search.
Although there might be more efficient book structures, I wanted to avoid a hash-based book because of collisions.
And as the final book will most likely not contain dramatically more than 1M nodes, 32 MB for an in-memory book is doable :), especially compared with memory-hungry tables like endgame databases.

The table structure also provides the option to include "zillions" of different positions from the middle game.
I added the option to expand the book for specific positions which might be interesting (supported by several new proprietary Guide commands, !book64 ..... ).
Also, as the book is checked every move, it is possible that after an initial drop-out of book the game comes back into book.

I would appreciate comments/suggestions from other programmers, related to book construction and/or whatever.
If all works, I will share the code and the book again (as usual) with all involved.

Bert
This seems like an interesting idea; Dragon's book is not that big, and I have started working on DOE expansion myself as well.

I use a dynamically sized hash table for storing the book; that seems to work well. I also store some additional data in the book, like search depth and the full position. The book won't grow much beyond a few million positions, so storage should not be a problem.

One question is still open to me: to what depth do you search the leaf-nodes?

You could search each leaf node for, let's say, 10 seconds, but if the game level is set higher, then the book is not good enough.

Also, if you train the book at 10 seconds/move and hardware becomes much faster in the next few years, the book might get outdated.

And if the time per move is too high, you get fewer positions in the book. What did you choose as the optimum?

Do you plan on a client/server architecture, so that you have many clients running single-threaded for increased efficiency?

Michel
Hi Michel, Bert

I have never worked on an automatic opening book generator and I am not familiar with Drop Out Expansion. My opening book is rather small (some thousands of positions) and was manually built.
Anyway, I do not understand why you do not use your "standard" hash table to store your book. That is my approach in Damy. I only had to add a mark saying that the corresponding entry in the hash table cannot be replaced by another one. If you like, you can easily handle such a mark so that entries are protected as long as you have not reached the 20th move, or as long as the number of pieces is greater than 30, or ... whatever you want.
Are collisions between book positions a real issue? First of all, with a big hash table the chance of a collision is quite small, and in addition it does not seem a real problem to be unable to use the book on rare occasions, does it?
BTW, I also use my hashtable to store some middle-game positions (as protected entries, of course).
Gérard

MichelG
Posts: 244
Joined: Sun Dec 28, 2003 20:24
Contact:

Re: Search Algorithm

Post by MichelG » Sun Apr 21, 2013 12:17

TAILLE wrote: Hi Michel, Bert

I have never worked on an automatic opening book generator and I am not familiar with Drop Out Expansion. My opening book is rather small (some thousands of positions) and was manually built.
Anyway, I do not understand why you do not use your "standard" hash table to store your book. That is my approach in Damy. I only had to add a mark saying that the corresponding entry in the hash table cannot be replaced by another one. If you like, you can easily handle such a mark so that entries are protected as long as you have not reached the 20th move, or as long as the number of pieces is greater than 30, or ... whatever you want.
Are collisions between book positions a real issue? First of all, with a big hash table the chance of a collision is quite small, and in addition it does not seem a real problem to be unable to use the book on rare occasions, does it?
BTW, I also use my hashtable to store some middle-game positions (as protected entries, of course).
With the limited number of positions in the book, the storage method isn't very important. I don't want any collisions though (I don't want to spend time considering all the bad effects they could have on the DOE generation), so I use a hash-based table that resolves collisions (it just stores the position in the first free slot, and expands the table if there are no free slots).
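A sketch of that insert-or-grow scheme is shown below. The types, the hash and the field names are placeholders rather than Dragon's actual code; the point is only the first-free-slot probing and the re-hash into a larger table.

Code: Select all

#include <stdint.h>
#include <stdlib.h>

typedef struct { uint64_t bbWhite, bbBlack; int iScore; uint8_t bUsed; } BOOKENTRY;
typedef struct { BOOKENTRY *pSlot; size_t nSlots, nUsed; } BOOKTABLE;   /* starts with >= 1 calloc'd slot */

static size_t SlotOf(uint64_t bbWhite, uint64_t bbBlack, size_t nSlots)
{
	return (size_t)((bbWhite * 0x9E3779B97F4A7C15ull ^ bbBlack) % nSlots);
}

static void BookInsert(BOOKTABLE *pTable, BOOKENTRY entry);   /* forward declaration */

/* Double the table and re-hash every stored entry (error handling omitted). */
static void BookGrow(BOOKTABLE *pTable)
{
	BOOKTABLE bigger = { calloc(pTable->nSlots * 2, sizeof(BOOKENTRY)), pTable->nSlots * 2, 0 };
	for (size_t i = 0; i < pTable->nSlots; ++i)
		if (pTable->pSlot[i].bUsed)
			BookInsert(&bigger, pTable->pSlot[i]);
	free(pTable->pSlot);
	*pTable = bigger;
}

static void BookInsert(BOOKTABLE *pTable, BOOKENTRY entry)
{
	if (pTable->nUsed + 1 >= pTable->nSlots)          /* no free slot left: expand */
		BookGrow(pTable);

	size_t i = SlotOf(entry.bbWhite, entry.bbBlack, pTable->nSlots);
	while (pTable->pSlot[i].bUsed)                    /* take the first free slot  */
		i = (i + 1) % pTable->nSlots;

	entry.bUsed = 1;
	pTable->pSlot[i] = entry;
	pTable->nUsed++;
}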

Michel

Ed Gilbert
Posts: 859
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA
Contact:

Re: Search Algorithm

Post by Ed Gilbert » Sun Apr 21, 2013 15:20

Anyway, I do not understand why you do not use your "standard" hash table to store your book. That is my approach in Damy. I only had to add a mark saying that the corresponding entry in the hash table cannot be replaced by another one. If you like, you can easily handle such a mark so that entries are protected as long as you have not reached the 20th move, or as long as the number of pieces is greater than 30, or ... whatever you want.
Are collisions between book positions a real issue? First of all, with a big hash table the chance of a collision is quite small, and in addition it does not seem a real problem to be unable to use the book on rare occasions, does it?
BTW, I also use my hashtable to store some middle-game positions (as protected entries, of course).
I do not do this because the requirements for the book database are quite different from those for the transposition table. You do not need to probe the book db at interior nodes in your search tree, only at the root. If the root position is not in the book then you do a normal search. When your opening book becomes large (>1M positions), it would start consuming a non-negligible part of your TT. Also, the information stored in my book nodes is different from the TT nodes. One difference is that the book nodes have 2 best moves, whereas the TT nodes have only 1. I think it is important to have the possibility of at least 2 best moves so that your engine does not always play the same move in a book position, which would make it too predictable and vulnerable to opponents preparing counter play against it.
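The move-variety point can be illustrated with a tiny sketch (the sentinel value and function names are assumptions, not the actual kingsrow code; rand() is assumed to be seeded elsewhere):

Code: Select all

#include <stdlib.h>

#define BOOK_NO_MOVE 127   /* assumed sentinel in the 7-bit move field: "no second move" */

/* Pick one of up to two stored book moves, so the engine does not always
   repeat the same opening line from a given book position. */
int PickBookMove(int iMove1, int iMove2)
{
	if (iMove2 == BOOK_NO_MOVE)
		return iMove1;
	return (rand() & 1) ? iMove1 : iMove2;   /* 50/50 between the two candidates */
}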

The opening book generator program uses a quite different book file format from the book db used by the engine. The generator program stores a lot of additional information in its source book file -- things like search time, search depth, search speed and nodes, and engine version. The file itself is in ASCII format so that additional fields can be added over time and still be compatible with earlier nodes, somewhat like XML.

The books that I built for 8x8 checkers were created over a period of several years, because there were no 8-core computers then. During this time the engine was also being developed, and I was learning more about the DOE process, so my algorithms were evolving. I once introduced a serious bug in the engine and did not discover it until after I had used it to add a few thousand nodes to the book. I was able to identify the suspect nodes and re-search them. As computers got faster I was able to update the book by re-searching some of the nodes that had been searched with a slow computer, or with a shorter search time than I subsequently wanted.

-- Ed

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Search Algorithm

Post by BertTuyt » Sun Apr 21, 2013 23:49

One question is still open to me: to what depth do you search the leaf-nodes?

You could search each leaf node for, let's say, 10 seconds, but if the game level is set higher, then the book is not good enough.

Also, if you train the book at 10 seconds/move and hardware becomes much faster in the next few years, the book might get outdated.

And if the time per move is too high, you get fewer positions in the book. What did you choose as the optimum?

Do you plan on a client/server architecture, so that you have many clients running single-threaded for increased efficiency?

Michel, I'm still working on the DOE itself; in particular I want to mix heuristic values with database results, and I want to implement cycle detection. So this keeps me most busy these days.

If all works, I want to search the leaf nodes to a depth of 24 ply.
I might also add information to the nodes (like Ed does), which supports updating nodes at a later stage.
I also want to enable parallel DOE construction.

Bert

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Search Algorithm

Post by BertTuyt » Sun Apr 21, 2013 23:56

Anyway, I do not understand why you do not use your "standard" hash table to store your book. That is my approach in Damy. I only had to add a mark saying that the corresponding entry in the hash table cannot be replaced by another one. If you like, you can easily handle such a mark so that entries are protected as long as you have not reached the 20th move, or as long as the number of pieces is greater than 30, or ... whatever you want.
Are collisions between book positions a real issue? First of all, with a big hash table the chance of a collision is quite small, and in addition it does not seem a real problem to be unable to use the book on rare occasions, does it?
BTW, I also use my hashtable to store some middle-game positions (as protected entries, of course).
Gérard, so far I have kept the book and the hash table separate, as in the initial phase I want to keep things simple, and therefore I use them separately.
Although your approach would be possible for now, I might change the book format and content in the future, so a future hashtable and book data structure won't match anymore. This is another reason to keep them apart from the start, as it gives some design freedom which I might use...

Bert

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Search Algorithm

Post by BertTuyt » Sun May 12, 2013 13:57

Next to the DOE approach to generating an opening book, I have started a parallel path.

I have several TurboDambase databases (all legally obtained :) ), which contain zillions of positions.
Although the TurboDambase format is proprietary, I was able to "understand" at least the "older" format (although in this format the first move was encrypted).
So already quite some time ago I wrote a program which was able to derive information from this database.

I recently modified this program to generate (based on the database) a text file with the FENs of all positions.
This way I was able to extract from the mega1997.pd file, which contains around 168K games, a text file with around 17M positions (including the start position, which I wanted as a check). The output text file (BookFEN.txt) is around 1.3 GB.
I also did a quick test to calculate the number of unique positions, which was 13.9M.

I'm now "studying" what to do with this file :), as using all positions (and move sequences) doesn't make sense, due to "blunder" moves.
Also, time is a constraint here. If I calculate (as an example) the 16-ply search score for every position, and the average search time is around 1 second, it will take (without parallel processing) around 160 days.
Although 160 days is not infinite, I would rather spend computer time in a different way.
With 4 processors (as this problem is highly scalable!!) it takes 40 days, which might be acceptable.
Also, a collaborative effort on multiple computers could help here..

I'm open to suggestions, sharing, whatever...

I'll keep you posted.

Bert

Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04
Contact:

Re: Search Algorithm

Post by Rein Halbersma » Mon May 13, 2013 09:07

BertTuyt wrote: Next to the DOE approach to generating an opening book, I have started a parallel path.

I have several TurboDambase databases (all legally obtained :) ), which contain zillions of positions.
Although the TurboDambase format is proprietary, I was able to "understand" at least the "older" format (although in this format the first move was encrypted).
So already quite some time ago I wrote a program which was able to derive information from this database.

I recently modified this program to generate (based on the database) a text file with the FENs of all positions.
This way I was able to extract from the mega1997.pd file, which contains around 168K games, a text file with around 17M positions (including the start position, which I wanted as a check). The output text file (BookFEN.txt) is around 1.3 GB.
I also did a quick test to calculate the number of unique positions, which was 13.9M.

I'm now "studying" what to do with this file :), as using all positions (and move sequences) doesn't make sense, due to "blunder" moves.
Also, time is a constraint here. If I calculate (as an example) the 16-ply search score for every position, and the average search time is around 1 second, it will take (without parallel processing) around 160 days.
Although 160 days is not infinite, I would rather spend computer time in a different way.
With 4 processors (as this problem is highly scalable!!) it takes 40 days, which might be acceptable.
Also, a collaborative effort on multiple computers could help here..

I'm open to suggestions, sharing, whatever...

I'll keep you posted.

Bert
Bert,

You do know that TurboDambase has an "export to PDN" menu entry? That will generate a plain-text PDN file of roughly 400 MB. No need for reverse engineering.

Rein

BertTuyt
Posts: 1592
Joined: Wed Sep 01, 2004 19:42

Re: Search Algorithm

Post by BertTuyt » Mon May 13, 2013 10:28

Rein, I know. But as the conversion program was already available, it was easy to start with it.
If I had to begin from scratch (with the huge memory and processing power available these days), I would also have used the PDN option.

Bert
