Internet engine matches
Re: Internet engine matches
Maybe in several years we will only remember the answer, which is 17, but will have forgotten the question.
We are lucky that the Elo difference is not 42.
Bert
-
- Posts: 299
- Joined: Tue Jul 07, 2015 07:48
- Real name: Fabien Letouzey
Re: Internet engine matches
Rein Halbersma wrote: So Kingsrow is only about 17 ELO points better than Scan.
I guess 17 Elo is a lot in draughts, although this would have little impact in a tournament.
I also assume that King's Row is not using large-scale machine learning, so this is important: ML as an alternative rather than a must-have as was in Othello. It seems that one can use either maths or knowledge/intuition + a lot of testing; I find that interesting.
-
- Posts: 1722
- Joined: Wed Apr 14, 2004 16:04
- Contact:
Re: Internet engine matches
Rein Halbersma wrote: So Kingsrow is only about 17 ELO points better than Scan.
Fabien Letouzey wrote: I guess 17 Elo is a lot in draughts, although this would have little impact in a tournament.
Actually, it's a bit more complicated. The normal ELO system uses a binomial logistic distribution to model predicted scores. In draughts, the drawing margin is so high that a trinomial logistic distribution is more appropriate. Then if F[x] = 1/(1+10^(-x/400)) (the usual ELO formula), you have to fit F[-draw+delta] == WIN_RATE and F[draw+delta] == 1 - LOSS_RATE. Plugging in the numbers, I (well, Mathematica...) find that draw == 438 and delta == 62 for the above match of 981 games.
So in effect, the high drawing margin is hiding a much greater strength difference (62 vs 17 points) than the plain score suggests; with a drawing margin around 0, the full difference would be visible. The drawing margin of over 400 points is really killing the game.
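The fit described above can be reproduced without Mathematica, because the logistic curve has a closed-form inverse. The following is only a minimal sketch: it assumes the loss rate is modelled as F(-draw - delta), which is the same as 1 - F(draw + delta), and the function names are made up for illustration. Since the win/loss counts of the 981-game match are not repeated here, the example starts from the quoted values draw = 438 and delta = 62, derives the rates they imply, and checks that the fit recovers them.

```python
import math

def F(x):
    """Usual Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** (-x / 400.0))

def logit400(p):
    """Inverse of F: the rating gap x for which F(x) == p."""
    return 400.0 * math.log10(p / (1.0 - p))

def fit_trinomial(win_rate, loss_rate):
    """Solve F(delta - draw) == win_rate and F(-delta - draw) == loss_rate."""
    a = logit400(win_rate)    # equals delta - draw
    b = logit400(loss_rate)   # equals -delta - draw
    delta = (a - b) / 2.0
    draw = -(a + b) / 2.0
    return draw, delta

def binomial_elo(win_rate, loss_rate):
    """Elo gap from the plain score, counting a draw as half a point."""
    draw_rate = 1.0 - win_rate - loss_rate
    return logit400(win_rate + 0.5 * draw_rate)

# Rates implied by the values quoted above (draw = 438, delta = 62).
win_rate = F(62 - 438)
loss_rate = F(-62 - 438)

draw, delta = fit_trinomial(win_rate, loss_rate)
print(round(draw), round(delta))                  # 438 62
print(round(binomial_elo(win_rate, loss_rate)))   # 17
```

The last line shows how the same win and loss rates collapse to about 17 points once draws are simply counted as half a point.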
As a consequence, I don't agree with Bert's conjecture that draughts programs are near perfect play. Even with a much smaller difference, I still think there is a lot of missing knowledge in draughts programs that leads to suboptimal positional play. It's just not exposed because every opponent is also missing that knowledge. A high drawing margin only means that programs are equally efficient in their current search/knowledge, not with respect to a hypothetical perfect play standard.
Fabien Letouzey wrote: I also assume that King's Row is not using large-scale machine learning, so this is important: ML as an alternative rather than a must-have as was in Othello. It seems that one can use either maths or knowledge/intuition + a lot of testing; I find that interesting.
I wonder what the limits of machine learning are. It seems ideally suited to exploit local patterns (in particular locks, perhaps breakthroughs), but global concepts (total terrain advancement, tempo development etc.) seem hard to fit with overlapping patterns. What are your thoughts on it?
-
- Posts: 299
- Joined: Tue Jul 07, 2015 07:48
- Real name: Fabien Letouzey
Re: Internet engine matches
Rein Halbersma wrote: I wonder what the limits of machine learning are. It seems ideally suited to exploit local patterns (in particular locks, perhaps breakthroughs), but global concepts (total terrain advancement, tempo development etc.) seem hard to fit with overlapping patterns. What are your thoughts on it?
One big limit of supervised learning is that it ignores search/eval interaction. It's possible that reinforcement learning doesn't share this problem.
Actually I think the main strength is in the little things: a higher-order PST. Patterns are not just approximations of the standard features. I make a connection with human vision.
Global concepts are out of reach of learning only if they are highly non-linear like left/right balance. Tempi for instance can be calculated exactly. An example of non-linear global concept would be position type.
Regarding how to improve it, there are many directions. Michel went on to use bigger patterns, and Michael Buro was also following that road in the GLEM paper. Using more non-linearity is another way, approaching ANNs in structure. See for instance Hannibal in Othello (http://satirist.org/learn-game/systems/ ... nibal.html). And then there's how to generate and weight the examples (in the supervised case). It has a big impact because statistical methods are very sensitive to correlation.
Re: Internet engine matches
Rein Halbersma wrote: I wonder what the limits of machine learning are. It seems ideally suited to exploit local patterns (in particular locks, perhaps breakthroughs), but global concepts (total terrain advancement, tempo development etc.) seem hard to fit with overlapping patterns. What are your thoughts on it?
If you look at the current developments in machine learning and vision (with neural nets, you can identify that a dog of a certain breed is catching a frisbee), it is easy to imagine that global concepts can be learned as well. Whether (slow) complex neural nets would work better than the simple but fast local patterns is an interesting question.
Re: Internet engine matches
Rein Halbersma wrote: I wonder what the limits of machine learning are. It seems ideally suited to exploit local patterns (in particular locks, perhaps breakthroughs), but global concepts (total terrain advancement, tempo development etc.) seem hard to fit with overlapping patterns. What are your thoughts on it?
I'm convinced that machine learning will eventually surpass any form of human learning. And I'm really glad that programs like Scan and Dragon showed us that this is the way to move forward.
In the early days of computer draughts, Truus (Stef Keetman) and Flits (Adri Vermeulen) dominated the field, and a little later Buggy (Nicolas Guibert). These three had in common that, besides their programming skills, they were very good at draughts (all three were highly ranked players). Nowadays this is different: I think Klaas Bor and Ton Tillemans are very good draughts players themselves, but that is no longer a prerequisite for creating the best program.
So I really hope that others in Computer Draughts will follow on the ML path, and share their results and methods.
Anyway, in my case, I will stop all hand-tuning of the evaluation function of Damage and will focus only on ML.
Bert
-
- Posts: 1368
- Joined: Thu Jun 20, 2013 17:16
- Real name: Krzysztof Grzelak
Re: Internet engine matches
Match KINGSROW - SCAN (3-move ballots - 988 games)
Kingsrow 1.56 vs. Scan 2.0 96 wins, 26 losses, 863 draws, 3 unknowns
Kingsrow
Opening Book = Best Moves
HashTable Size = 128 MB
DB cache Size = 10000 MB 6P
CPU 4 = 8 cores
Pondering = Off
Time = 5 Min / 80 Moves
Scan
Opening Book = book-margin = 0
DB cache Size = 2000 MB, 6P
CPU 4 = 8 cores
Pondering= Off
Time = 5 Min / 80 Moves
The match was played on a computer with the following hardware.
Processor - Intel Core I7 2670QM, 2.2GHz
Hard disc - SSD Samsung 840 Evo 1 TB
Memory - 16 GB DDR3 1333
System - Windows 7 Home Premium 64 bit Service Pack 1 PL
This was the last test of the programs Kingsrow and Scan. Thanks to Ed Gilbert and Fabien Letouzey.
- Attachment: dxpgames.pdn (1.02 MiB)
-
- Posts: 859
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Re: Internet engine matches
Hi Krzysztof,
Thank you for posting these match results, but I think there might be some problem with your configuration of scan. I just now got scan running (none of my computers support popcount so I had to download VS2015 and recompile it using a software emulation for popcount). I'm getting results different than yours, with scan looking stronger than kingsrow. Can you post your scan.ini file here, and also check that you have all of its endgame db files? Maybe Fabien or others that are more familiar with scan can suggest other checks.
-- Ed
Re: Internet engine matches
Ed Gilbert wrote: I just now got scan running (none of my computers support popcount so I had to download VS2015 and recompile it using a software emulation for popcount).
-- Ed
Hi Ed, today I had success with compiling Mobydam for non-popcnt computers (the compile is shared in the Mobydam thread). Would you be so kind as to share your non-popcnt compile of Scan with us?
Regards,
Michael Taktikos
-
- Posts: 859
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Re: Internet engine matches
http://edgilbert.org/InternationalDraug ... ources.zip
New files and changed files (for Windows compile). Also add bitcount.cpp to the project.
-
- Posts: 1368
- Joined: Thu Jun 20, 2013 17:16
- Real name: Krzysztof Grzelak
Re: Internet engine matches
Ed Gilbert wrote: Hi Krzysztof,
Thank you for posting these match results, but I think there might be some problem with your configuration of scan. I just now got scan running (none of my computers support popcount so I had to download VS2015 and recompile it using a software emulation for popcount). I'm getting results different than yours, with scan looking stronger than kingsrow. Can you post your scan.ini file here, and also check that you have all of its endgame db files? Maybe Fabien or others that are more familiar with scan can suggest other checks.
-- Ed
Ed, I am using the 6-piece endgame db. This is how my ini file is saved:
book = true
book-margin = 0
threads = 4
tt-size = 6
bb-size = 6
dxp-server = true
dxp-host = 127.0.0.1
dxp-port = 27531
dxp-initiator = false
dxp-time = 5
dxp-moves = 80
dxp-board = true
dxp-search = true
-
- Posts: 859
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Re: Internet engine matches
Krzysztof Grzelak wrote: Ed am using 6 endgame db. The file that you ask put below.
http://www83.zippyshare.com/v/eNG6DLYf/file.html
To access the file I must first install FreeGamesZone on my PC, which I do not want to do. The scan.ini file should be just a few lines. Can you copy/paste it here?
I am particularly interested to see the "threads =" entry. You wrote,
CPU 4 = 8 cores
Does that mean that your scan.ini entry is "threads = 8"?
-- Ed
-
- Posts: 859
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Re: Internet engine matches
Krzysztof Grzelak wrote: tt-size = 6
This is the problem. The transposition table is much too small. Change it to
tt-size = 24
and I think you will get much different results.
-- Ed
-
- Posts: 1368
- Joined: Thu Jun 20, 2013 17:16
- Real name: Krzysztof Grzelak
Re: Internet engine matches
The entry "CPU 4 = 8 cores" means that the program uses 4 CPUs, while the computer supports 8 cores in total. I think tt-size = 24 is too much. Fabien uses that value, but he uses all 4 endgame db. Once I changed it to 30 and the program thought for so long that the game ended.
-
- Posts: 859
- Joined: Sat Apr 28, 2007 14:53
- Real name: Ed Gilbert
- Location: Morristown, NJ USA
- Contact:
Re: Internet engine matches
The number of transposition table entries is 2^tt-size. That means 2 raised to the power of tt-size. If tt-size is 6, the number of entries is 2^6 = 64. If tt-size is 24, then the number of entries is 2^24 = 16.7 million. Each tt entry is 16 bytes, so 16.7 million entries is 256 MB. I don't know how scan behaves with different tt sizes, but clearly 6 is much too small. I would guess a value of 23 would be good for matches of 3 to 5 minutes. It's probably not critical (but 6 is very bad).
-- Ed
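The arithmetic in the post above is easy to check for any tt-size. This is just a small sketch: the 16-byte entry size is taken from the post and has not been checked against the Scan source, and the function names are made up for illustration.

```python
TT_ENTRY_BYTES = 16  # entry size as stated in the post; an assumption, not read from Scan's code

def tt_entries(tt_size):
    """Number of transposition table entries: 2 raised to the power tt_size."""
    return 1 << tt_size

def tt_bytes(tt_size, entry_bytes=TT_ENTRY_BYTES):
    """Total table size in bytes for a given tt-size setting."""
    return tt_entries(tt_size) * entry_bytes

for tt_size in (6, 23, 24):
    mib = tt_bytes(tt_size) / (1024 * 1024)
    print(f"tt-size = {tt_size}: {tt_entries(tt_size)} entries, {mib:g} MiB")
```

With these assumptions tt-size = 6 gives 64 entries (1 KiB in total), tt-size = 23 gives 128 MiB, and tt-size = 24 gives 256 MiB.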