Computer tournament and tie break

Discussion about development of draughts in the time of computer and Internet.
Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04

Post by Rein Halbersma » Sun May 31, 2009 20:11

Ed Gilbert wrote:
Putting aside this possibility to have additional games, do you first agree on the above criteria and their order?
The only one that looks odd to me is '2) The number of victories'. I say that because victories are often the result of a blunder by your opponent, which is more or less a random event, but defeats, unless due to a program crash such as in your case recently, have to be attributed to a weakness in the program, because the game of draughts is a draw in the absence of mistakes. So I think it is worse to have N+1 wins and 1 loss than N wins and no losses.

I have a question on a slightly different but related topic. Are there some written rules for these tournaments which describe the conditions under which the game clock can be paused, or a program can be restarted? I don't think I saw any for the Arleux tournament. If a program crashes, is it allowed to restart (I guess so as it was in your case), and does the program's clock continue to lose time during reboot and restart, or can it be paused? I'm sure that whatever happened in Arleux was reasonable, I just think it should be written somewhere so that everyone knows what should happen in these cases.

-- Ed
The ICGA tournaments have slightly different rules:
http://www.grappa.univ-lille3.fr/icga/e ... .php?id=20

The World Computer Chess Championships have these rules:
http://www.grappa.univ-lille3.fr/icga/e ... .php?id=12

Note that their tie-breaking rules assume a Swiss tournament rather than a round robin tournament.

Another mechanism to break ties (for 2 tied programs) would be to have 1 program play with the standard 25 minutes, and the other with X < 25 minutes, but a draw will suffice to win for the program with X minutes. Before the game, both players alternately bid for the lowest X that they feel confident they can draw with. Such a game would have quite a high probability of ending in a decision. For multiple tied programs, it would work the same. When two programs meet in a tie-break, the one with the lowest bid will play with its bid on the clock, needing only a draw.
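
To make the bidding idea concrete, here is a minimal Python sketch of it. The function names (`run_auction`, `tiebreak_winner`) and the rule that each bid must undercut the previous one are assumptions made for illustration; the proposal itself only says that the lowest bidder plays with that time and needs a draw.

```python
STANDARD_MINUTES = 25  # normal time control named in the proposal


def run_auction(bids):
    """Return (low_bidder, minutes) after a sequence of alternating bids.

    bids is a list of (player, minutes) in the order the bids were made.
    Assumption: every bid must be lower than the previous one, and the
    first bid must be below the standard 25 minutes.
    """
    lowest = STANDARD_MINUTES
    leader = None
    for player, minutes in bids:
        if player == leader:
            raise ValueError("players must bid alternately")
        if not 0 <= minutes < lowest:
            raise ValueError("each bid must undercut the previous one")
        leader, lowest = player, minutes
    if leader is None:
        raise ValueError("no bids were made")
    return leader, lowest


def tiebreak_winner(low_bidder, opponent, result_for_low_bidder):
    """The low bidder needs only a draw; the opponent must win outright."""
    if result_for_low_bidder not in ("win", "draw", "loss"):
        raise ValueError("unknown result")
    return opponent if result_for_low_bidder == "loss" else low_bidder


# Hypothetical auction: A opens at 10 minutes, B undercuts, A undercuts again.
leader, minutes = run_auction([("A", 10), ("B", 5), ("A", 3)])
```

In this made-up auction A ends up playing with 3 minutes against B's 25, and a draw in that game gives A the tie-break.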

Ed Gilbert
Posts: 859
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Post by Ed Gilbert » Sun May 31, 2009 20:38

Hi Rein,

I found this ICGA rule interesting:

An operator may only: [a] enter moves, [b] respond to a request from the computer for clock information, and [c] synchronize the computer clock to the normal chess clock. 
IMO this is an improvement over the rules that we have played with. We are not allowed to synchronize the computer clock to the table game clock. This means that the operator must overcompensate for operator time in order to guarantee not forfeiting a game on time. In my case, I usually play the first game with an overhead of 4sec/move, and thereafter I allow 3sec/move, under the assumption that I am more practiced after the first game. But 3 seconds is a lot of time. In a blitz game of say 5sec/move, with 3 seconds of overhead that only leaves 2 seconds for searching!

-- Ed

TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Post by TAILLE » Sun May 31, 2009 23:32

Ed Gilbert wrote:The only one that looks odd to me is '2) The number of victories'. I say that because victories are often the result of a blunder by your opponent, which is more or less a random event, but defeats, unless due to a program crash such as in your case recently, have to be attributed to a weakness in the program, because the game of draughts is a draw in the absence of mistakes. So I think it is worse to have N+1 wins and 1 loss than N wins and no losses.
This is the first time I have seen the number of victories treated as a disadvantage. It means you consider 2 draws better than 1 win and 1 loss, which is not so obvious. Anybody would prefer to see a player (computer or human) accept a risk in order to try to avoid a draw, and this is of course possible for computers. Keeping criterion 2 may encourage programmers to take such risks instead of playing only very solid moves and waiting patiently for an opponent's mistake. For me it is a pity to delete this criterion but, if you do not like it, of course I will delete it.
Are there other views on this point ?
Ed Gilbert wrote: I have a question on a slightly different but related topic. Are there some written rules for these tournaments which describe the conditions under which the game clock can be paused, or a program can be restarted? I don't think I saw any for the Arleux tournament. If a program crashes, is it allowed to restart (I guess so as it was in your case), and does the program's clock continue to lose time during reboot and restart, or can it be paused? I'm sure that whatever happened in Arleux was reasonable, I just think it should be written somewhere so that everyone knows what should happen in these cases.
-- Ed
I am in the same situation as you. It is not quite clear to me what the rules for our computer tournaments are. That is why we have to work on this point among ourselves in order to build at least a common understanding of what we would like to see.
Gérard

Ed Gilbert
Posts: 859
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Post by Ed Gilbert » Mon Jun 01, 2009 02:55

For me it is a pity to delete this criterion but, if you do not like it, of course I will delete it.
Of course it is only my opinion, which has no more value than anyone else's. But criterion 2) seems at least controversial, whereas I don't see anything controversial about the others. I also would like to hear what others think about 2).

-- Ed

Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04

Post by Rein Halbersma » Mon Jun 01, 2009 09:44

Rein Halbersma wrote: Another mechanism to break ties (for 2 tied programs) would be to have 1 program play with the standard 25 minutes, and the other with X < 25 minutes, but a draw will suffice to win for the program with X minutes. Before the game, both players alternately bid for the lowest X that they feel confident they can draw with. Such a game would have quite a high probability of ending in a decision. For multiple tied programs, it would work the same. When two programs meet in a tie-break, the one with the lowest bid will play with its bid on the clock, needing only a draw.
Not only do such imbalanced games have a high probability to be decisive, the tie-break itself will by construction lead to a winner! And they should be quite spectacular. What are your opinions on this proposal? I can't think of any objections so far.

TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Post by TAILLE » Mon Jun 01, 2009 12:01

Ed Gilbert wrote:Of course it is only my opinion, which has no more value than anyone else's. But criterion 2) seems at least controversial, whereas I don't see anything controversial about the others. I also would like to hear what others think about 2).
-- Ed
On thinking about criterion 2) a second time I now agree with you, Ed, but for another reason. Suppose program A obtains 4 wins, n draws and 1 loss and program B obtains 3 wins and n+1 draws. We are sure to be able to determine the winner by looking at the ...th best result, which is either a win for A against a draw for B, or a draw for B against a loss for A.
As a consequence criterion 2) is not really necessary.

Updated list:
1) The number of points
2) The Sonneborn Berger
3) The best result
4) The second best result
5) The third best result
...
Gérard

TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Post by TAILLE » Mon Jun 01, 2009 12:18

Rein Halbersma wrote:
Rein Halbersma wrote: Another mechanism to break ties (for 2 tied programs) would be to have 1 program play with the standard 25 minutes, and the other with X < 25 minutes, but a draw will suffice to win for the program with X minutes. Before the game, both players alternately bid for the lowest X that they feel confident they can draw with. Such a game would have quite a high probability of ending in a decision. For multiple tied programs, it would work the same. When two programs meet in a tie-break, the one with the lowest bid will play with its bid on the clock, needing only a draw.
Not only do such imbalanced games have a high probability to be decisive, the tie-break itself will by construction lead to a winner! And they should be quite spectacular. What are your opinions on this proposal? I can't think of any objections so far.
As far as I am concerned I do not like this proposal, for the following reason: between 2 strong programs A and B, A having 25 minutes and B 0 minutes, my feeling is that the probability of A winning is less than 50%, because you have to remember that B will, in any case, use the time consumed by program A. As a consequence the winner of the toss will be the operator who is very confident in his speed of manipulation.
As such it is not satisfactory for me.
Gérard

Rein Halbersma
Posts: 1722
Joined: Wed Apr 14, 2004 16:04

Post by Rein Halbersma » Mon Jun 01, 2009 16:21

TAILLE wrote:
Rein Halbersma wrote:
Rein Halbersma wrote: Another mechanism to break ties (for 2 tied programs) would be to have 1 program play with the standard 25 minutes, and the other with X < 25 minutes, but a draw will suffice to win for the program with X minutes. Before the game, both players alternately bid for the lowest X that they feel confident they can draw with. Such a game would have quite a high probability of ending in a decision. For multiple tied programs, it would work the same. When two programs meet in a tie-break, the one with the lowest bid will play with its bid on the clock, needing only a draw.
Not only do such imbalanced games have a high probability to be decisive, the tie-break itself will by construction lead to a winner! And they should be quite spectacular. What are your opinions on this proposal? I can't think of any objections so far.
As far as I am concerned I do not like this proposal, for the following reason: between 2 strong programs A and B, A having 25 minutes and B 0 minutes, my feeling is that the probability of A winning is less than 50%, because you have to remember that B will, in any case, use the time consumed by program A. As a consequence the winner of the toss will be the operator who is very confident in his speed of manipulation.
As such it is not satisfactory for me.
Let's test your feeling. I played the following game: Kingsrow (white) thinking for 1 second per move, pondering on, against Truus with 25 minutes for 75 moves. Here is the result:

[Event ""]
[Date "2009.6.1"]
[White "Kingsrow"]
[Black "Truus"]
[Result "0-1"]
1. 32-28 18-23 2. 34-29 23x34 3. 39x30 20-25 4. 44-39 25x34 5. 39x30 14-20 6. 50-44 10-14 7. 37-32 20-25 8. 41-37 25x34 9. 40x29 19-24 10. 29x20 15x24 11. 31-27 5-10 12. 43-39 17-21 13. 45-40 12-18 14. 37-31 14-19 15. 46-41 21-26 16. 41-37 7-12 17. 40-34 10-15 18. 27-22 18x27 19. 31x22 1-7 20. 36-31 12-18 21. 31-27 7-12 22. 34-29 11-17 23. 22x11 16x7 24. 29x20 15x24 25. 39-34 18-23 26. 44-39 9-14 27. 37-31 26x37 28. 42x31 14-20 29. 27-21 20-25 30. 31-27 7-11 31. 21-16 11-17 32. 49-43 13-18 33. 34-30 25x34 34. 39x30 2-7 35. 43-39 8-13 36. 39-34 4-9 37. 48-43 9-14 38. 43-39 3-8 39. 47-42 17-21 40. 42-37 21-26 41. 30-25 7-11 42. 16x7 12x1 43. 34-30 8-12 44. 39-34 1-7 45. 34-29 23x34 46. 30x39 18-23 47. 28-22 6-11 48. 33-28 12-18 49. 39-34 7-12 50. 38-33 12-17 51. 34-30 11-16 52. 22x11 16x7 53. 28-22 7-11 0-1

<img src="http://fmjd.org/dias2/save/12438657831.png">

Of course, it's not a proof that my proposal is good (perhaps Ed or Bert can play some automated matches with these skewed time controls), but it shows that it is quite hard to get an easy draw when you go for 0 minutes. And it took Kingsrow 5 minutes of time to do all the moves (although I had to operate both Truus and Kingsrow).

Ed Gilbert
Posts: 859
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Post by Ed Gilbert » Mon Jun 01, 2009 17:49

I played the following game: Kingsrow (white) thinking for 1 second per move, pondering on, against Truus with 25 minutes for 75 moves.
A problem with this test is that Kingsrow presently does not ponder for an indefinite period of time until the opponent makes a move. It only ponders for an average time that is the same as its search time setting. This is something that I can easily improve. I think Gerard is right that if it ponders for the full time of the opponent's search then it will be impossible to get the probability of a win up to .5. I also do not like that this idea seems to test your ability to guess a correct search time as much as the ability of the engine. It would also test your ability to be really fast at entering moves with the mouse and making moves on the table board in order not to forfeit the game on time.

-- Ed

TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Post by TAILLE » Mon Jun 01, 2009 23:14

Hi,

1) The number of points
2) The Sonneborn Berger
3) The best result
4) The second best result
5) The third best result
...

Of course I put the criterion "2) The Sonneborn Berger" in the list because this criterion was actually used in Arleux.
If however we are happy with the 3rd, 4th ... criteria, why take the Sonneborn Berger into account?
If the Sonneborn Berger scores of the programs competing for first place are different, doesn't that mean that criteria 3, 4, ... will be able to decide who is the winner?
If this is true then I propose to delete this criterion and to keep only 2 criteria:
1) The number of points
2) The best differentiating result
That way we have a very simple view of what is needed to win a tie break: you have to obtain a win against the strongest possible program!
Gérard

Ed Gilbert
Posts: 859
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Post by Ed Gilbert » Tue Jun 02, 2009 01:32

Hi Gerard,

I prefer the SB score as the primary tie-break method. It is a more comprehensive approach that takes into account the strength of all the opponents and how you did against each one, as opposed to just a single game like the best result. IMO the best result is ok as a second order tie break, but only because it will seldom be used.

-- Ed
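
For readers who have not met the SB score, here is a minimal sketch of the usual definition: the sum of the final points of the opponents you beat plus half the final points of the opponents you drew with. The function name, the data layout and the example numbers are assumptions for illustration, not the code used in Arleux.

```python
def sonneborn_berger(games, final_points):
    """Sum the final points of beaten opponents plus half those of drawn ones.

    games: list of (opponent, result) with result in "win", "draw", "loss";
           an opponent may appear twice in a double round robin.
    final_points: dict mapping each program to its final tournament points.
    """
    weight = {"win": 1.0, "draw": 0.5, "loss": 0.0}
    return sum(weight[result] * final_points[opponent]
               for opponent, result in games)


# Hypothetical three-opponent example (names and points are made up).
final_points = {"X": 6, "Y": 4, "Z": 2}
games = [("X", "draw"), ("Y", "win"), ("Z", "loss")]
sb = sonneborn_berger(games, final_points)  # 0.5*6 + 1.0*4 + 0.0*2
```

Because every game contributes, the score reflects the strength of all opponents rather than a single result.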

TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Post by TAILLE » Tue Jun 02, 2009 12:21

Ed Gilbert wrote:Hi Gerard,

I prefer the SB score as the primary tie-break method. It is a more comprehensive approach that takes into account the strength of all the opponents and how you did against each one, as opposed to just a single game like the best result. IMO the best result is ok as a second order tie break, but only because it will seldom be used.

-- Ed
OK Ed.

So far the result of the discussion is the following:

Firstly we propose to double the number of games (two games per opponent) and secondly we propose to add a new criterion (3):

1) The number of points
2) The Sonneborn Berger
3) The best differentiating result (after having classified all programs according to the first two criteria)

One important point remains: do we want to add additional games (one, two, three?) as a criterion (I guess between the first and the second above), or do we consider that it is not so interesting because the probability of drawn games is too high?

Maybe we also have a rather minor point (because highly improbable) to solve: if the tie remains after these 3 criteria, I proposed that the winner could be the program with the worst rating or the youngest programmer or ?
Gérard
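
One possible reading of criterion 3 can be sketched in Python as follows. The interpretation is an assumption: the opponents are walked from strongest to weakest (as ranked by the first two criteria) and the first opponent against which the two tied programs scored differently decides. The names are hypothetical, and for simplicity one game per opponent is assumed.

```python
RESULT_VALUE = {"win": 2, "draw": 1, "loss": 0}


def best_differentiating_result(ranking, results_a, results_b):
    """Return "A", "B", or None if no opponent separates the two programs.

    ranking: opponents ordered from strongest to weakest.
    results_a, results_b: dict opponent -> "win" / "draw" / "loss".
    """
    for opponent in ranking:
        if opponent not in results_a or opponent not in results_b:
            continue  # skip opponents the two programs do not share
        a = RESULT_VALUE[results_a[opponent]]
        b = RESULT_VALUE[results_b[opponent]]
        if a != b:
            return "A" if a > b else "B"
    return None


# Made-up example: both programs score 4 points against the same opponents.
ranking = ["Top", "Mid", "Low"]
results_a = {"Top": "draw", "Mid": "win", "Low": "draw"}
results_b = {"Top": "draw", "Mid": "draw", "Low": "win"}
decided_for = best_differentiating_result(ranking, results_a, results_b)
```

Here the results against "Top" are identical, so the win against "Mid" decides in favour of A.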

Ed Gilbert
Posts: 859
Joined: Sat Apr 28, 2007 14:53
Real name: Ed Gilbert
Location: Morristown, NJ USA

Post by Ed Gilbert » Tue Jun 02, 2009 14:32

One important point remains: do we want to add additional games (one, two, three?) as a criterion (I guess between the first and the second above), or do we consider that it is not so interesting because the probability of drawn games is too high?

Maybe we also have a rather minor point (because highly improbable) to solve: if the tie remains after these 3 criteria, I proposed that the winner could be the program with the worst rating or the youngest programmer or ?
I do not have a strong preference for any of the choices. I am ok with declaring multiple winners, or even some of the more whimsical suggestions like youngest, lowest rating, etc. About additional games, I suspect the odds of getting a win in a normal game between 2 fairly evenly matched opponents are less than 10%. Going to blitz time controls might increase these odds, but I don't like blitz because it is as much a test of your hand-eye coordination as it is of your engine. I normally allow about 3 seconds of operator time for each move. In a blitz game of 5sec/move, that leaves 2 seconds per move for searching. If I want to be daring and try to accomplish all the operator stuff in 2sec/move, then I can increase my search time by 50%, but at the risk of possibly losing by running out of time. IMO we should not be testing operator skills, only engines. Probably my first preference for a tie break would be a pair of games with a lopsided start position -- but not an endgame or midgame position, only an opening position with 40 or nearly 40 men on the board. But as I said I don't have a strong preference for any of these. I would like to hear some opinions of the other programmers.

-- Ed
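
The blitz arithmetic above can be checked with a few lines of Python. The function name is hypothetical; the figures (5sec/move, 3 or 2 seconds of operator time) are the ones from the post.

```python
def search_time(seconds_per_move, operator_overhead):
    """Time left for the engine once operator handling is subtracted."""
    if operator_overhead >= seconds_per_move:
        raise ValueError("operator overhead uses up the whole move budget")
    return seconds_per_move - operator_overhead


cautious = search_time(5, 3)  # 3 s reserved for the operator
daring = search_time(5, 2)    # 2 s reserved for the operator
gain = (daring - cautious) / cautious  # relative increase in search time
```

Shaving one second off the operator allowance moves the search time from 2 to 3 seconds per move, which is the 50% increase mentioned.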

FeikeBoomstra
Posts: 306
Joined: Mon Dec 19, 2005 16:48
Location: Emmen

Post by FeikeBoomstra » Tue Jun 02, 2009 16:52

TAILLE wrote:
For me there are two very different competitions:
1) Match between 2 computers
2) Tournament with a lot of computers

In the first case (match) the idea is to know which of the two computers is the best. We need a lot of games for that; most of them will end in a draw, and an automatic way of playing these games is necessary. As a consequence these matches can be played with remote access between the two computers, and the two programmers do not need to meet physically.

In the second case (tournament) the idea is more like a spectacle, with the opportunity for the weaker programs to compete against the best ones, the opportunity for a spectator to see a lot of winning games (compared to a match) and the opportunity for the programmers to meet physically. The winner is not necessarily the "real" best program; it is simply the program which has the most wins against the weaker programs. As a consequence chance plays a large part and the winner will often change from one tournament to the next.

In this spirit the last French open tournament confirmed that 4 programs (TDKing, Damy, Kingsrow and Damage) dominate the draughts world, but of course nobody will deduce which is the best one.

Coming back to the tie break problem, I think it should also be a spectacle. We cannot ignore the number of people who were interested and came to see this tie break.
I do not like at all the approach with a miniature against a human and I would not be happy to continue that way but, at least, the spectators were happy with this spectacle.
My view is that we have to do our best so that our tournaments can continue to be viewed as a spectacle. Automatic matches with rapid games between two computers cannot be a spectacle and, in this sense, will decrease the interest of a tournament for the spectators.

In conclusion my view is the following: let's organise matches between computers during the year and let's try to organise tournaments as a spectacle during which we could see a lot of wins by the top programs against the weaker ones.

but we have to keep in mind the spirit of a tournament (as opposed to the spirit of a match).
I agree, with a tournament it is important that in case of a tie, there is a spectacle.
I also agree, that operator skills should not be the deciding factor.

But why not ignore the operator time? We have to trust the engine's own judgment about the time spent, but to me that's not a big deal. So only the time the engine is running (on its own time, not the pondering time) is counted, and the operator can take his time to copy the moves without errors.

Kind regards,

Feike.
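
A rough sketch of how such an engine-only clock could be kept, assuming the engine reports its own search time after each move. The class and method names are hypothetical and not taken from any existing program.

```python
class EngineOnlyClock:
    """Charges a program only for the time its engine reports searching."""

    def __init__(self, budget_seconds):
        self.remaining = float(budget_seconds)

    def charge_search(self, seconds):
        # Time the engine reports for searching on its own move.
        self.remaining -= seconds

    def charge_operator(self, seconds):
        # Operator handling is deliberately ignored under this proposal.
        pass

    def charge_pondering(self, seconds):
        # Pondering happens on the opponent's time, so it is not counted.
        pass

    @property
    def flag_fallen(self):
        return self.remaining < 0


clock = EngineOnlyClock(25 * 60)  # 25 minutes, as in the tournament games
clock.charge_search(2.0)
clock.charge_operator(4.0)    # slow, careful move entry costs nothing
clock.charge_pondering(10.0)  # thinking on the opponent's time costs nothing
clock.charge_search(3.5)
```

The obvious trade-off is the one already mentioned: the figures come from the engine itself, so the opponents have to trust them.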

TAILLE
Posts: 968
Joined: Thu Apr 26, 2007 18:51
Location: FRANCE

Post by TAILLE » Tue Jun 02, 2009 19:29

Hi Ed.
Ed Gilbert wrote:Hi Gerard,

I prefer the SB score as the primary tie-break method. It is a more comprehensive approach that takes into account the strength of all the opponents and how you did against each one, as opposed to just a single game like the best result. IMO the best result is ok as a second order tie break, but only because it will seldom be used.

-- Ed
I am not sure I really understand the meaning of the SB.
Let's take 2 programs classified near the middle of the tournament and let's suppose that these two programs each score 9 draws and one loss.
Program A lost against the best program of the tournament and program B lost against the weakest program. According to SB there is no hesitation: program B is far better than A. Is that really common sense? It is a little strange to decide that it is better to lose against the weakest program rather than against the strongest one, but on the other hand managing to obtain a draw against the best program may also be a good performance. As far as I am concerned I am really hesitating.

BTW the best differentiating result also chooses program B!

Remembering that you consider that a loss against a weak program is certainly not a good sign of strength, what is your feeling, Ed?
Gérard
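
The A versus B case above can be worked through numerically. The sketch assumes the usual SB definition (full points of beaten opponents plus half the points of drawn ones), 2 points for a win and 1 for a draw, and a single round robin of 12 programs in which A and B drew with each other; all final points below are invented for the example.

```python
def sonneborn_berger(games, final_points):
    """Final points of beaten opponents plus half those of drawn opponents."""
    weight = {"win": 1.0, "draw": 0.5, "loss": 0.0}
    return sum(weight[result] * final_points[opponent]
               for opponent, result in games)


# Invented final standings: A and B both have 9 draws and 1 loss... plus
# their mutual draw is included among the draws (9 points each in total
# is assumed here purely for illustration).
final_points = {"Best": 16, "A": 9, "B": 9, "Weakest": 3}
others = [14, 12, 11, 10, 8, 7, 6, 5]
final_points.update({f"P{i}": pts for i, pts in enumerate(others, start=1)})


def games_for(me, lost_to):
    """All draws except a single loss against lost_to."""
    return [(opp, "loss" if opp == lost_to else "draw")
            for opp in final_points if opp != me]


sb_a = sonneborn_berger(games_for("A", "Best"), final_points)
sb_b = sonneborn_berger(games_for("B", "Weakest"), final_points)
difference = sb_b - sb_a  # half of (Best's points - Weakest's points)
```

Whatever numbers are chosen, the gap is half the difference between the best and the weakest program's points, because B collects half of the best program's points where A collects half of the weakest program's.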
