I've implemented Feike's data structure with a Position struct holdingRein Halbersma wrote: But that detail aside, I think the code indicates that Horizon uses the copy-make approach to move generation both for the reversible and the irreversible parts of a position. But it also does not have an explicit array of hash keys to do repetition detection. Rather it uses an implicit array through pointer to the previous position.
This is also the way Don Dailey described how his CilkChess program (and its current program Komodo) works. Now we know of course that CilkChess was a very strong parallel program. Part of that was the ease of doing parallel computations by not maintaining a big global state but copying a small state and tracing history through pointers. The Cilkchess state was 192 bytes (3 cache lines) but I think the "tnode" struct of Horizon is even less than 64 bytes (single cache line!!) so the copy overhead should be a lot smaller than in a chess program. Although I have no idea how often (in % of number of move generated) a new thread is spawned, copying a big global state with a long history (several kilobytes?!) could easily be more expensive than the copy-make approach.
Rein
all *current* state information (pieces/kings bitboards, color, hash
key, number of moves since last conversion, and some other counters)
and a pointer to the previous position. I've padded this struct to
make it exactly a 64 byte cache line. During a search, I copy this
struct into a new position, and then make the move on that copy. The
search is then called with a reference to this copy. Afterwards, there
is no undo at all. The perft numbers without bulk-counting show a
minor speed increase.
This data structure is exactly the same as in CilkChess. One big
advantage is that it it makes the search automatically re-entrant
compared to doing make/undo on a fixed position. There was some
discussion on Talkchess a year ago about copy-make vs make-undo. Bob
Hyatt was writing that the excessive copying could blow out the memory
bandwidth. At least with my single core P4 I don't have that problem.
For now, I think I'll keep this new data structure as it would also
remove a few hundred lines in my undo code.
Rein