Flits self-learning mode

BertTuyt · Post by **BertTuyt** » Fri May 22, 2020 00:49

Interesting discussion.

As often shared, I guess most kudos go to Fabien as he was the first (although to my knowledge also Michel used this principle in Dragon) who applied it in Draughts (base ideas from Othello M Buro if I remember well), and provided all sources.

Ed help me with the optimization program as I was not sure if my games were ok (as we all know the right data and input is key for ML), and by using at least a base which worked, I knew that if things were still not ok, then it was my games.
And again you need to do a lot before you get reliable games (and i tried many options).
Would I solved all without Ed help?

I guess so, as (like others) I studied Physics and Mathematics (and for the data processing lots of Computational Physics), and I have sufficient know how to make gradient descent work (as Im quite familiar with it).

Even from scratch, as there are many libraries already available for this purpose.
So in the end I think I would have taking an additional 1 - 2 months to reach the results which I have now.

But most work is still fine tuning the evaluation, trying new things, adding features, removing features.
Next to that I study the Stockfish code, and want to try every idea in the search().
Not sure how many are doing that in parallel.

Also on my list is the move generator, where I constantly try to make it faster (the Perft tests).
And I can go on, mixing search with specific Draughts heuristics, combining evaluation and search with breakthrough tables.
Changing YBWC to Lazy SMP.....

I most likely started somewhere in 1970 with the first ideas, so after 50 years i have a 1-2 month advantage from Ed, so thanks for that

So yes I have an advantage by starting quite young, and being 63 (close to retirement

).

Bert

Sidiki · Post by **Sidiki** » Fri May 22, 2020 08:00

Hi everybody,

I'm really sorry, though, what originally, as I have meant more than once was just for fun. Again if Ed, who more than once enlightened me in secret on certain points of Kingsrow or allowed to carry out experiments to run Kingsrow on Android.
First of all, I would like to clarify one thing: never has any programmer with whom Ed has had to exchange program codes come into contact with me and vice versa.

The opening library created does not require any notion of programming whatsoever. A primary school kid could do it, just know the process.

Kingsrow and Scan have opposite styles. Scan is aggressive and Kingsrow defender.
What gives him the advantage on Scan, is that during the attacks of the latter, he "neglects" the combinations and loses, because, I must admit, Kingsrow has a "look" very very developed for combinations, you will notice that the majority of games won by Kingsrow against Scan were the result of combinations.

So I adopted, a selection of games by great masters like Ton Sijbrands, Ndjofang, and I admit some Scan games for attack games without the risk of losing pawns, combinations or uncomfortable position less of 20 moves.

It is the opposite that I did with Scan, with games by Guntis Valnéris, Aman Eugène and Kingsrow which are more defensive than offensive.

No one will ever use it in a competition, it's for fun that I share it with you.
I am a player, and I know that the start of the game plays a very very important role in the strength of an international grandmaster.

I know that almost all current programs have very good levels, which makes, in addition to the evaluation function of each program that one software is stronger than another is the opening book.
Here programmers often create book which later catch up with their programs, although they are strong.

It is the same for a software, even that would have an endgame database of 10 pawns, would not draw a lost game with 10 pieces.

Sorry for the inconvenience.

Maximus plays the classic game a lot, so the anti-Maximus will be pure classic from Georgiev, Chizov, Baliakin and a little from Kingsrow.

See you Jan Jaap tonight.

Sidiki

Rein Halbersma · Post by **Rein Halbersma** » Sat May 23, 2020 10:10

Ed Gilbert wrote: Thu May 14, 2020 20:46 I played a quick match between kr_hub and flits_hub. Flits_hub is an engine that I got from Klaas Bor. Klaas has adapted Turbo Dambase to use Hub engines, and he wrote a Hub interface to Flits to use for testing. Don't ask me or Klaas for this, he did this as a favor to me, and with Adri V.'s permission, because I helped him interface TDam to kr_hub.

So this is exciting news! Does this mean that there will be a TurboDambase version with Kingsrow as the analysis engine? Or even better, with the 8pc endgame dbs as well?

Rein Halbersma · Post by **Rein Halbersma** » Sat May 23, 2020 10:16

Ed Gilbert wrote: Mon May 18, 2020 23:25 I don't fully understand how this book was created, and I'm not sure that I think this is an acceptable thing to do from a "fair competition" point of view. You could probably copy the entire kingsrow opening book off 1.5M positions with a little bit of programming, and use it in another engine. Do the programmers think that would be acceptable? What is the essential difference between coping an opening book and what is being done to create this flits book? As a practical matter, I don't think the opening book is very important for 10x10 engines, and I would be surprised if flits became very much stronger if it had a large opening book. But still, I would like to understand how this book is being created.

Recreating another engine's book from black box reverse engineering seems entirely above board to me. So not looking at the binary files or doing de-compilation, but just playing 1.5M openings and logging when the reply is instant should give you the entire variation tree but without the various eval score per node. Doing your own dropout expansion on this variation tree and computing your own engine's eval scores on it and trying to find best responses to the Kingsrow book, to me is part of the competitive process.

Rein Halbersma · Post by **Rein Halbersma** » Sat May 23, 2020 10:42

jj wrote: Thu May 21, 2020 19:21 In general, testing against available engines to improve your own is common practice, and that includes the opening book, I suppose (I am not there yet). But yes, taking this to the extreme it starts to look like reverse engineering. To copy the entire Kingsrow opening book with a little bit of programming and use it in another engine is of course not acceptable. Where to draw the line? And the problem with forbidding things is how to enforce them. A possibility is to play a tournament without opening books, provided all programs have this option.

I think there is a big difference between directly interfacing (through de-compilation or other white box techniques) the Kingsrow book (with the positions and Kingsrow eval scores) that comes with its public version and black box reverse engineering. The former is against tournament rules, but the latter requires a serious programmer effort: writing a testing framework, building your own opening variation tree, using your own eval and dropout expansion to refine the book etc. I don't see anything wrong with that.

I think the matter of fairness becomes a matter of whether engines are publicly available and whether they play public matches. Anybody can download Kingsrow, Scan, Damage, a few people have Maximus, less people have Sjende Blyn, etc. Maximus played a number of public matches, Sjende Blyn effectively refuses to play matches. The more private an engine is, the more advantage it potentially has to public engines.

This is true, but the same is true for human play. Sijbrands has 2000+ games in Turbo Dambase, and anyone preparing for a match against him can use that. In tournament play with serious stakes, the top chess programs also devote serious CPU hours to opening preparation beyond what is publicly available. But opening preparation in draughts is overrated in any case.

Another thing about "fair competition" is the fact that Ed shared his optimization program only with Bert and with no-one else. With this program and the explanation offered by Ed every programmer can generate a very strong evaluation function in a short period of time. In my opinion Bert now has an advantage over other programmers who don't have access to the optimization program. Bert did share the games he generated but not the optimization program, this is Ed's to share of course. The generated games are not that useful from a research point of view, as they were generated by a black box using a simplified Kingsrow evaluation function. Evaluation functions derived from these games indirectly derive from Kingsrow evaluation and with the use of Ed's optimization program we might speak of "Damage powered by Kingsrow". What do the other programmers think, is this acceptable?

Yes, I have absolutely no problems with it. First of all, the optimization program is not part of the engine itself, but merely tooling to create a strong eval function. Scan doesn't release it's own ML pipeline used to compute the eval weights. Authors of different engines can collaborate on this, either by exchanging ideas or even full source code. This is entirely different from copying another engine's weight files (that is against tournament rules).

I am not asking for myself because I generated my own games from zero knowledge and made my own optimization program. This I did using the public information given by Fabien and Ed on this forum and by studying neural network theory. I have a few small innovations (some of which are also used by Bert now) but in spirit Maximus' evaluation function is indebted to Scan 3.0 and up for future research. In 2018 Ed and I compared notes, we have many differences so a comparison would still be interesting. Understanding this subject and getting it all to work is a rite of passage as Fabien calls it and it is a lot of work. In my opinion, giving one person access to such a powerful tool and not making it publicly available makes for an unfair competition. It is unfair to the people who don't have access to an optimization program and in a way also to the people who put in the work to make their own.

Well, I'm sure Ed / Bert collaborating and exchanging such tooling is mutually beneficial to both of them. They have a long history of fruitful collaboration (the whole DXP tooling around Flits/Truus e.g.) Why would one of them systematically give the other tools that put him at a disadvantage? But even if that were true, tough luck! And sure, that puts other authors at a disadvantage. It's the same when a small group of human GMs work together in training to mutually improve at the detriment of their competitors. Unless you can add your own unique added value, you have no right to be included in such a training group. Similarly, anyone looking for a handout on ML optimization tooling, should have something worth trading for.

And for years, Ed had a huge competitive advantage from his decades of endgame building experience. After Bert, Michel and Gerard caught up with him and managed to build 7 and 8 pc dbs, Ed voluntarily open sourced the Kingsrow drivers to his dbs. But not (yet?) the db building tools. Maybe the same will happen with ML tools, or maybe not.

I won't be using any handwritten gradient descent code from any draughts author however. The publicly available professional libraries (TensorFlow, PyTorch, or even SciPy) are far superior to what any of us can write here (and yes, I have written gradient descent code in my studies, 25 years ago). If I ever get around to it (into Stratego nowadays, little time for draughts), I would release any tooling that transforms draughts positions into a format that can be used by such professional ML libraries (since all my code is open source anyway).

Rein Halbersma · Post by **Rein Halbersma** » Sat May 23, 2020 10:51

Ed Gilbert wrote: Thu May 21, 2020 22:35 - I kind of lost interest in the experiment when I heard what the self-learning was all about. I agree that doing this just for fun, as an experiment to see how much flits can improve with an "anti-kingsrow" book is perfectly acceptable. I guess the point of my post, although maybe I didn't make it very well, is that it doesn't feel right to make a book that specifically counters every move of another program's book, and then use that in competition, as that sounds like it's basically a form of copying a book. It's just my opinion, and I was interested to see what other engine developers thought about it.

Ed, you have played thousands of games against Flits/Truus and probably also against Scan/Damage. What would be wrong with you parsing all those PGNs into a big opening variation tree and running dropout expansion against that? This is exactly how humans prepare against each other's published play. You would still be using your own eval to improve your own book. You maybe copying (in a very indirect way) the opening book moves, but not copying the eval values. And if that would be OK, why not go even further and systematically probe a publicly available engine and reconstruct it's entire opening book and run DOE against that? To me, this seems entirely in the spirit of reinforcement learning since it is aimed at improving your own engine rather than verbatim copying another engine.

Another thing about "fair competition" is the fact that Ed shared his optimization program only with Bert and with no-one else. With this program and the explanation offered by Ed every programmer can generate a very strong evaluation function in a short period of time. In my opinion Bert now has an advantage over other programmers who don't have access to the optimization program.
I never thought about it that way, but I can see your point. I shared the code with Bert because he asked me for it, and we had collaborated on other projects in the past, like the dxp wrappers for truus and flits, and Bert had shared some of his language localization code with me. I don't think there's anything proprietary or new in the code. Gradient descent is a well documented algorithm, and there's a lot of information available describing how to implement it. It is a lot of work though to develop and debug, or at least it was for me. Perhaps I should think about publishing the source.

Of course, I'd welcome any code that you would publish but I don't think you have any moral obligation to do so. IMO, the most valuable part would be the part of extracting PGNs to a datafile of positions that is suitable to be run against gradient descent. The gradient descent code itself is probably best done using a professional library.

jj · Post by jj » Sat May 23, 2020 13:25

BertTuyt wrote: Fri May 22, 2020 00:49 Interesting discussion.
...
So yes I have an advantage by starting quite young, and being 63 (close to retirement ).

Bert

Bert,

It is clear that you could not have done it without the extensive help of Ed so your account of the situation is not credible. Once someone shows you how to get a first good result it is easy to start fiddling with the parameters.

Let me ask you, did you even ask Ed if you could use his code to reveal (for you) the format of his weights.bin file and to use that to generate games? Did you ask him if you could use his optimization code as-is to tune your own weights? Did Ed actually give you any of these permissions? Or was he just trying to help you to create your own code and did you take advantage of the code you were trusted with?

The way I see it Bert took at least two shortcuts. What do the other programmers think, is this acceptable? Should "Damage powered by Kingsrow" be allowed to compete in tournaments?

Jan-Jaap

jj · Post by jj » Sat May 23, 2020 13:31

Rein, thanks for expressing your opinion. I think you will agree with me that using somebody else's tools can only be done with the author's permission. And that has not been established in this case.

Rein Halbersma wrote: Sat May 23, 2020 10:51 The gradient descent code itself is probably best done using a professional library.

I doubt that, but maybe Fabien and Ed can comment on this.

Jan-Jaap

BertTuyt · Post by **BertTuyt** » Sat May 23, 2020 14:11

JJ,

when I asked Ed if I could use his optimization program he shared his tool with me.
As he shared also the source, I was able to understand how the input file looks like, and what the output format was.
I changed the program to deal with my evaluation features, and my slightly different input and output.

In all cases I constantly aligned with Ed, and I also shared the changes I made.
And the new tools I generated I forwarded to him.
And where I was not sure, Ed provided me with support.

So yes, I'm very grateful, for all the help from Ed side.
As Rein stated, I think collaboration between programmers is not forbidden, and should not be forbidden to my knowledge.
I guess (like all of us), within the Draughts community we not only share in the forum, but there is also email exchange.
From my side I discussed many topics with Adri Vermeulen, Stef Keetman, Ed, Michel (to name a few), and many more.

Next to that (to my knowledge) I also shared several ideas/topics in this forum.
So what is the point you want to make, and your issue?

Did I have an advantage by collaboration with many draughts programmers, yes I did.
Should I feel guilty by that, no i don't.
Think we all built upon the shoulders of giants, and much we use nowadays is the work of zillions of people, who shared ideas.

Hope this answers your post.

Bert

BertTuyt · Post by **BertTuyt** » Sat May 23, 2020 15:01

JJ, to make sure here your questions again:

Let me ask you, did you even ask Ed if you could use his code to reveal (for you) the format of his weights.bin file and to use that to generate games?

I got the optimization tool to generate my own weights based upon my own games, and own evaluation.
The tool was with source, so I could change the input, as I wanted that the optimization tool had no knowledge about the evaluation, so I separate them.
All the knowledge about the evaluation is in the tool i made (called cpn) that reads the .pdn files, derives the relevant positions, and for every position a feature vector is constructed. I have 2 types of features, pattern-features (where I only provide the pattern index to the optimization tool), and property features which have a numerical value.

I changed the input processing of the tool from Ed, so next to the file with all positions, I add an .ini file which describes the features (masks, pattern and property, this was all new.

Did you ask him if you could use his optimization code as-is to tune your own weights?

Yes I did, as this was the purpose of the exercise, first to check if the games quality was ok, and i needed several loops (like all) to optimize the match conditions. When i saw the first positive results i started a run with the old Damage Engine to generate all the games (took 10 days if i remember well on a 8 core 4 Ghz machine). But I could not use the tool as-is, as there are evaluation differences, and some more details (Ed uses all positions and applies a Q-search to derive the end-points, I remove positions with a capture).

Did Ed actually give you any of these permissions?

Yes he did, as there was open communication on a regular basis.
He also tested the version now made available which is 50 ELO below Kingsrow.

Bert

BertTuyt · Post by **BertTuyt** » Sat May 23, 2020 15:22

JJ, and more information about the story.

I generated many games with different settings (constant time, and constant depth, different search parameters, and much more).
I shared these files with Ed, so Ed run it on his tool, and he could use the weights.bin file he made to run it with Kingsrow, and I did the same (in the meantime the original Ed tool was changed, only the base algorithm gradient descent with mini batches was the same).

Initially it did not work.
I guess the breakthrough came when i switched to bullet games (with constant time), and with a (more or less) base alpha-beta search as pruning throws away the baby with the bathwater, especially with bullet games.

So one day Ed told me he sees far better results, but my tool (= the tool from Ed with change input, an no evaluation code) still produced non-sense. A final check revealed that i made a bug in the cpn tools (which reads .pdn and converts to an optimization file). After that change it went crescendo.

That was the moment i started with the huge games generation run, and when that was ready a started to test all kinds of evaluation features and patterns (which I'm still doing).

So if you say powered by Kingsrow, I would like to change that, powered by a stable gradient descent program with mini batches (and fine tuned learning rates) by Ed, used to test different evaluation functions, and generated weight files, based upon a games set made by a previous Damage Engine (which contained many errors in the evaluation, but you need mistakes to learn).

Bert

Rein Halbersma · Post by **Rein Halbersma** » Sat May 23, 2020 15:32

jj wrote: Sat May 23, 2020 13:31 Rein, thanks for expressing your opinion. I think you will agree with me that using somebody else's tools can only be done with the author's permission. And that has not been established in this case.

If someone sends me a tool (binary or source) then I think permission is implied. So if that tool can parse a file of positions into a table of features and do gradient descent on it to get a bunch of feature weights, I would certainly use it to see what comes out of it. Using the source to reverse engineer the binary weight file in order to read in this weight file is not something I would ever do (this is what Joost did with Scan).

Rein Halbersma wrote: Sat May 23, 2020 10:51 The gradient descent code itself is probably best done using a professional library.
I doubt that, but maybe Fabien and Ed can comment on this.

Why would you doubt that? Optimization is a very specialized field with lots of improvements on top of the basic textbook gradient descent. Since ~2012 onwards especially, stochastic gradient descent with special tricks (momentum, automatic differentiation) and GPU accelerated computation has made it possible to optimize huge models with millions of parameters on billions of data points that don't fit into RAM. I doubt anyone here is able to independently make these kind of improvements on the textbook one-line implementation of gradient descent. E.g. automatic differentiation alone is non-trivial to implement (I have done a toy version, to understand the principle, but I would never rely on it).

FWIW, I have extensive experience doing optimization on all kinds of big data problems using libraries from SAS, Stata, Matlab, R and Python. Selecting the correct algorithm flavor, algorithm parameters and initial conditions is hard enough (the results can fluctuate, both in time to solution and in accuracy). Getting to learn a good library takes time, and adapting your own data generating process (playing millions of games in this case and extracting positions and features) takes even more time, but the savings in terms of bug-free optimization are a no-brainer.

Rein Halbersma · Post by **Rein Halbersma** » Sat May 23, 2020 15:43

jj wrote: Sat May 23, 2020 13:25 The way I see it Bert took at least two shortcuts. What do the other programmers think, is this acceptable? Should "Damage powered by Kingsrow" be allowed to compete in tournaments?

Since you ask other programmer's opinions: I find the phrase "Damage powered by Kingsrow" accusatory. If Damage has the unmodified Kingsrow weight file, then yes, it would be powered by Kingsrow. The same as Joost's program Argus was powered by the Scan weights. But if Damage generated its own games, with its own features (or even Ed's features for that matter) and used Ed's gradient descent code, then it's not powered by Kingsrow, IMO.

Suppose someone came in a tournament using Ed's endgame db driver? Would that be "powered by Kingsrow"?

jj · Post by jj » Sat May 23, 2020 15:53

A lot of text now. First I would like to know if Ed does confirm Bert's account.

jj · Post by jj » Sat May 23, 2020 15:54

Rein Halbersma wrote: Sat May 23, 2020 15:43 Suppose someone came in a tournament using Ed's endgame db driver? Would that be "powered by Kingsrow"?

You know very well this driver is public, and it was only for practical reasons (no Elo difference).

World Draughts Forum

Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode

Re: Flits self-learning mode