Self-learning and Monte Carlo algorithm
Hi all,
Recently, on 4 December 2017, Google's DeepMind played a match that opposed their new chess engine, AlphaZero, against Stockfish, the best chess engine.
100 games were played, and AlphaZero won with 28 wins, 72 draws and 0 losses.
Four hours of self-learning was enough for AlphaZero to master the game of chess (without an opening book or endgame tables) and crush the best chess engine.
I know that only Windames, Plus500 and Aurora Borealis have this option.
Could self-learning (or teaching) be excellent for improving a draughts program, or even for solving the game?
Isn't it a kind of endgame generator?
https://chess24.com/en/read/news/deepmi ... shes-chess#
Sidiki--
-
- Posts: 299
- Joined: Tue Jul 07, 2015 07:48
- Real name: Fabien Letouzey
Re: Self-learning and Monte Carlo algorithm
Hi Sidiki,
There is so much confusion (not just from you).
First of all, the AlphaZero learning duration was 9 hours (on thousands of specialised machines), not 4 as everybody keeps copy/pasting. It is claimed that AlphaZero *reached* the level of Stockfish after 4 hours, but that is not the version that played the match.
Secondly, machine learning of the evaluation (as in Scan and AlphaZero, although they differ a lot) has nothing to do with the good old opening-book learning that you are mentioning. The latter only recognises positions it has seen before, so it is basically a search tree (analysis) stored on disk.
Endgame-table generators are a form of exhaustive search: they make sure every possible position (with a specific material signature) is looked at. Again, nothing to do with evaluation (or learning).
Fabien.
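To make the "exhaustive search" idea concrete, here is a minimal sketch in Python. The game is a made-up toy (a subtraction game), not draughts, and this is not any real generator's code; the point is only that the table assigns an exact value to *every* position, just as an endgame table covers every position with a given material signature.

```python
# Toy illustration of the exhaustive-search idea behind endgame tables:
# label EVERY position of a tiny game, not just the ones reached in play.
# Hypothetical game: a pile of n stones, a move removes 1, 2 or 3 stones;
# the player who takes the last stone wins.

def solve_all(max_n):
    # table[n] == True  -> the side to move wins from a pile of n stones
    table = [False] * (max_n + 1)      # n == 0: no move left, side to move loses
    for n in range(1, max_n + 1):
        # a position is a win if some move leads to a losing position
        table[n] = any(not table[n - k] for k in (1, 2, 3) if k <= n)
    return table

table = solve_all(10)
# Every position 0..10 now has an exact game-theoretic value, the way an
# endgame table stores a value for every position of a material signature.
print([n for n in range(11) if not table[n]])  # losing piles: multiples of 4
```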
Re: Self-learning and Monte Carlo algorithm
Fabien, congratulations! I did not know that you are not only a great programmer but also very good at chess.
Luzimar Araujo
Re: Self-learning and Monte Carlo algorithm
Hi Fabien,
Thanks for all these clarifications. So the truth is different from what is claimed on many websites and blogs.
Re: Self-learning and Monte Carlo algorithm
If you have doubts, you can post some links and I will have a look.
Some of the confusion is understandable. "learning" is a vague term: it's supposed to describe programs that improve with time. For example, a learning program might get better at solving combinations the more you use it. By contrast, just computing something is not "learning" in itself; you seem to be suggesting that.
The modern variant, usually called "machine learning", goes well beyond rote learning used to remember positions. It's usually used "offline", which means that the programmer runs the learning once during development. And then you get the resulting program, which doesn't learn afterwards (that would be both complicated and pointless). That's why the "learning" functionality doesn't appear; it's already been used by the author.
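A hedged sketch of this offline workflow (the names, features and data below are invented for illustration; this is not Scan's or AlphaZero's actual code): the author fits evaluation weights once during development, and the shipped program only ever uses the frozen result.

```python
# "Offline" machine learning as described above: learning runs once,
# during development; the shipped program contains only frozen weights.

def fit_eval_weights(positions, results, features, epochs=100, lr=0.01):
    """Run ONCE by the programmer: stochastic gradient descent fitting a
    linear evaluation to known game results (a deliberately simple model)."""
    n = len(features(positions[0]))
    weights = [0.0] * n
    for _ in range(epochs):
        for pos, res in zip(positions, results):
            f = features(pos)
            err = sum(w * x for w, x in zip(weights, f)) - res
            weights = [w - lr * err * x for w, x in zip(weights, f)]
    return weights

def evaluate(pos, weights, features):
    """What ships in the program: the weights are a frozen constant,
    so nothing is learned at play time."""
    return sum(w * x for w, x in zip(weights, features(pos)))
```

This is why the "learning" option never shows up in the released program: by the time you run it, the learning step has already happened on the author's machine.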
Re: Self-learning and Monte Carlo algorithm
Hi Fabien,
I remember that in one of your posts on Scan 2.0, you wrote that the learning takes place during the preparation of the program, and that afterwards, i.e. when the program plays, this option is no longer available. So what is the truth in this AlphaZero story?
Perhaps it has a very, very huge database resulting from the learning, or a revolutionary eval function. They said that it is Monte Carlo, which is based on a deep search.
Sidiki
Re: Self-learning and Monte Carlo algorithm
Hi Sidiki,
I do not understand your question.
What is the problem with using Monte Carlo as the search algorithm during a game?
Using Monte Carlo does not mean you are in a learning process and, BTW, in the past I experimented a little with this algorithm as the search algorithm in Damy.
Gérard
Re: Self-learning and Monte Carlo algorithm
Hi Gérard,
My question was, and I can say that it is more a hypothesis than a question: perhaps AlphaZero uses a very huge database resulting from its learning.
I would just point out that they said that its eval function is Monte Carlo.
-
- Posts: 221
- Joined: Thu Nov 27, 2008 19:22
- Contact:
Re: Self-learning and Monte Carlo algorithm
I don't think AlphaZero's search should be called Monte Carlo; it selects moves in the search tree based on the advice of the evaluation function, so it is a deliberate way of pruning. This is, I think, the main innovation of AlphaZero, but it is hard to tell how much this impacts performance.
The main power of AlphaZero, besides computational power and setting the match conditions, seems to be the massive neural net it uses for evaluation. I don't believe it uses a database when playing.
AlphaZero's publicity is absolutely fantastic.
http://slagzet.com
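The "selecting moves on the advice of the evaluation" idea can be sketched in the AlphaZero (PUCT) style. Everything below is a stand-in (the toy game, priors and values are invented; a real engine would get value and move priors from its neural network), and for clarity the sketch is single-agent; a two-player engine would flip the value's sign at each ply when backing it up.

```python
# Evaluation-guided tree search, AlphaZero style: the evaluation's move
# priors steer the descent (PUCT), and the leaf is scored by the
# evaluation directly, with no random playout.
import math

class Node:
    def __init__(self, prior):
        self.prior = prior       # the evaluation's prior for the move into this node
        self.visits = 0
        self.value_sum = 0.0
        self.children = {}       # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def select_child(node, c_puct=1.5):
    # PUCT: exploit high Q, explore moves the evaluation says are promising
    total = sum(ch.visits for ch in node.children.values())
    return max(node.children.items(),
               key=lambda mc: mc[1].q() + c_puct * mc[1].prior
                              * math.sqrt(total + 1) / (1 + mc[1].visits))

def search(root, evaluate, legal_moves, play, position, n_sims=100):
    for _ in range(n_sims):
        node, pos, path = root, position, []
        while node.children:                 # descend, guided by PUCT
            move, node = select_child(node)
            pos = play(pos, move)
            path.append(node)
        value, priors = evaluate(pos)        # no random playout at the leaf:
        for m in legal_moves(pos):           # the evaluation scores it directly
            node.children[m] = Node(priors.get(m, 0.0))
        for n in [root] + path:              # back the leaf value up the path
            n.visits += 1
            n.value_sum += value
    # the most-visited root move is the engine's choice
    return max(root.children.items(), key=lambda mc: mc[1].visits)[0]
```

Note how this is "a deliberate way of pruning": moves with low prior and low backed-up value simply stop receiving visits.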
-
- Posts: 20
- Joined: Mon Oct 17, 2016 09:05
- Real name: Robin Messemer
Re: Self-learning and Monte Carlo algorithm
Having looked at the AlphaZero/AlphaGo Zero papers, one shouldn't call the algorithm MCTS, because there are no longer random playouts at leaf nodes; the leaf evaluation comes from the neural network instead.
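For contrast, the random-playout step that classic MCTS uses at a leaf, and that AlphaZero replaces with a network value, looks like the sketch below. The game interface (legal_moves, play, winner) is hypothetical.

```python
# Classic MCTS leaf estimation: play uniformly random moves to the end of
# the game and use the result as the leaf's value. This random sampling is
# the "Monte Carlo" part that the AlphaZero algorithm no longer has.
import random

def random_playout(position, legal_moves, play, winner, max_plies=200):
    """Return the playout result (+1 win, 0 draw, -1 loss) from the
    viewpoint of the side to move at the starting position."""
    sign = 1
    for _ in range(max_plies):
        result = winner(position)
        if result is not None:
            return sign * result
        position = play(position, random.choice(legal_moves(position)))
        sign = -sign             # the side to move alternates each ply
    return 0                     # playout cut off: score it as a draw
```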