PDN standard

Rein Halbersma · Post by **Rein Halbersma** » Mon May 11, 2009 10:20

Wieger Wesselink wrote:* Moves can now have a 'Nag' annotation. This is adopted from PGN. It is an annotation like $1, $2, ... I don't know exactly what the purpose of it is, but in TurboDambase they are actually used.

From the PGN standard:

8.2.4: Movetext NAG (Numeric Annotation Glyph)
An NAG (Numeric Annotation Glyph) is a movetext element that is used to indicate a simple annotation in a language independent manner. An NAG is formed from a dollar sign ("$") with a non-negative decimal integer suffix. The non-negative integer must be from zero to 255 in value.

http://www.saremba.de/chessgml/standard ... te.htm#c10

NAG Interpretation 
0 null annotation 
1 good move (traditional "!") 
2 poor move (traditional "?") 
3 very good move (traditional "!!") 
4 very poor move (traditional "??") 
5 speculative move (traditional "!?") 
6 questionable move (traditional "?!") 
7 forced move (all others lose quickly) 
8 singular move (no reasonable alternatives) 
9 worst move

Codes 10 and higher are reserved for annotations about the position rather than the last played move.
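
As an illustration of how a reader might use the table above, here is a minimal Python sketch that maps the move-level NAG codes 1 to 6 to their traditional glyphs. The names `NAG_GLYPHS` and `nag_to_glyph` are made up for this example and are not part of PGN or PDN.

```python
import re

# Move-level NAGs that have a traditional glyph, copied from the table above.
NAG_GLYPHS = {1: "!", 2: "?", 3: "!!", 4: "??", 5: "!?", 6: "?!"}

# A NAG is a dollar sign followed by a non-negative decimal integer.
NAG_RE = re.compile(r"\$(\d+)")


def nag_to_glyph(token):
    """Return the traditional glyph for a NAG token such as '$3', or None."""
    match = NAG_RE.fullmatch(token)
    if match is None:
        raise ValueError(f"not a NAG: {token!r}")
    code = int(match.group(1))
    if code > 255:
        # The PGN text quoted above limits NAG values to 0..255.
        raise ValueError(f"NAG out of range: {code}")
    return NAG_GLYPHS.get(code)
```

Codes without a traditional glyph (0, 7, 8, 9 and the position codes from 10 upward) return `None` in this sketch.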

Rein Halbersma · Post by **Rein Halbersma** » Mon May 11, 2009 10:28

Wieger Wesselink wrote:
I have never seen a colon used as a move separator though.
That's what I have seen Russian players do. But it can be easily removed from the standard.

http://aurora.shashki.com/images/pic/wmain.gif

They use ":" to designate captures. A forgiving parser would take as a valid move definition any two square numbers separated by any single non-whitespace character. Currently used conventions are "-", "x", "*", ":" or even no space at all. E.g. my FEN parser takes any sequence of square numbers with an arbitrary number of whitespace or non-numeric characters without hiccups. There is no real need to be so strict in the standard.
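
A minimal Python sketch of the two forgiving readers described above, under some assumptions: the separator is taken to be any single character that is neither whitespace nor a digit, only two-square moves are handled (no multi-captures), and the "no separator at all" convention is not covered. The function names are hypothetical.

```python
import re

# Two square numbers joined by exactly one non-whitespace, non-digit character.
FORGIVING_MOVE = re.compile(r"(\d+)[^\s\d](\d+)")


def parse_move(text):
    """Return (from_square, to_square) for moves like '32-28' or '19:30'."""
    match = FORGIVING_MOVE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a two-square move: {text!r}")
    return int(match.group(1)), int(match.group(2))


def square_numbers(text):
    """Return every square number in text, ignoring whatever separates them."""
    return [int(s) for s in re.findall(r"\d+", text)]
```

With this sketch, `parse_move("19:30")` and `parse_move("19x30")` give the same pair of squares, so the choice of separator carries no meaning for the reader.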

Rein Halbersma · Post by **Rein Halbersma** » Mon May 11, 2009 10:40

Wieger Wesselink wrote:Something that also needs to be discussed is extensions to this standard. How should clock times be added to a game? This is useful for games played on an electronic board and for games between computers. How should time controls and times used by the players be recorded? Finally it would be nice to have a possibility to do a setup of a new position at arbitrary points. This helps to store analyses of games, and it also makes it possible to store games with illegal moves.

http://digitalgametechnology.com/site/i ... nsion.html
Example for including time lapse and eval score into a variation comment
{[%clk 0:00:07][%eval -6.05] White is toast}

Time controls are discussed in the PGN standard, section 9.6
http://www.saremba.de/chessgml/standard ... e.htm#c9.6
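
To show how such embedded commands could be read back, here is a small Python sketch that separates `[%name value]` commands from the free text of a comment. It assumes the shape seen in the example above and nothing more; `split_comment` is a hypothetical helper, not part of any existing tool.

```python
import re

# An embedded command looks like [%name value], as in the example above.
COMMAND_RE = re.compile(r"\[%(\w+)\s+([^\]]*)\]")


def split_comment(comment):
    """Split a '{...}' comment into (commands dict, remaining free text)."""
    body = comment.strip()
    if body.startswith("{") and body.endswith("}"):
        body = body[1:-1]
    commands = dict(COMMAND_RE.findall(body))
    free_text = COMMAND_RE.sub("", body).strip()
    return commands, free_text


commands, text = split_comment("{[%clk 0:00:07][%eval -6.05] White is toast}")
```

The values are kept as strings here; converting the clock time or the score to numbers is left to the caller.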

Rein Halbersma · Post by **Rein Halbersma** » Mon May 11, 2009 10:54

Ed Gilbert wrote:Regarding extensions like accepting 349 for 34-29: I don't think most existing parsers accept that. What has been proposed so far does not break any programs that I am familiar with. Most programs already consider the result codes following the list of moves as optional, and consider the header tags of the following game as a terminator for the previous game. So what has been proposed only confirms that as an accepted practice and would tell new programs to not write those result codes. But if you extend the standard in ways that break existing programs then I think there has to be a good reason for it. In the spirit of writing exactly, I don't think the standard should say that 34-29 should be written as 349, so we're only talking about accepting that during reading. But if we add that to the new pdn standard and then you create a pdn file with the move written as 349, that file can only be read by parsers that have been updated to conform to the new standard, and it will not be read correctly by any legacy draughts programs. There will always be some legacy programs that never update to a new standard, and new programs will never write moves in that format, so I think in the spirit of not gratuitously breaking legacy programs we should not include that kind of extension to the standard.

-- Ed

I am not proposing to break legacy programs. I think the only community that will want to be able to read and write short numeric notation (349 instead of 34-29) are problem composers. If a PDN file has a tag Notation (see Adrian Millet's proposal) then it is *up to the user* to supply such a PDN to a program that supports this short notation. Currently there are no such programs but they might be developed.

I am perfectly happy with a "reduced export format" that mandates the use of long numeric notation and whatever it takes not to break legacy programs. But the standard could easily be extended to allow for other formats. That is OK as long as *the user* knows this and adapts around it by only supplying such other formats (Russian draughts games, problem variations etc.) to programs that are explicit in accepting this. There is no need to be rigid in the standard. It should reflect current and plausible future usage.

Piet Bouma · Post by **Piet Bouma** » Mon May 11, 2009 13:07

Rein Halbersma wrote:
Wieger Wesselink wrote:
I have never seen a colon used as a move separator though.
That's what I have seen Russian players do. But it can be easily removed from the standard.

http://aurora.shashki.com/images/pic/wmain.gif

They use ":" to designate captures. A forgiving parser would take as a valid move definition any two square numbers separated by any single non-whitespace character. Currently used conventions are "-", "x", "*", ":" or even no space at all. E.g. my FEN parser takes any sequence of square numbers with an arbitrary number of whitespace or non-numeric characters without hiccups. There is no real need to be so strict in the standard.

A few years ago I downloaded an evaluation version of the Aurora database.
In PDN files downloaded from the program, the move separator is "-" or "x".
So I see no need to add a colon as a move separator.
People can display moves on the Internet any way they want, but why add another separator character if programs don't use it?

For example: http://www.draughts.ru/beijing/games/men1round.htm

Scroll the left frame down; there you can download the .pdn file, which uses "-" and "x".

Wieger Wesselink · Post by **Wieger Wesselink** » Mon May 11, 2009 23:36

I have made an EBNF for a 'forgiving' parser, see below. This one accepts moves with spaces (one before and/or after the separator). It also accepts moves in chess notation. One more difference with respect to the previous version is that the Nag annotations are at the same level as variations and comments. Apparently that is the way they are used in TurboDambase. With this grammar I have tested a lot of pdn files that I found on the internet. All of them pass, except for the ones with real errors or with 0-2, 1-1 or 2-0 as game separators. The 1-1 result is what causes a real problem. It is a prefix of a move like 1-12, and therefore it requires a non-trivial parser to deal with that case. This does not seem acceptable to me, so I left those results out.

Perhaps we can try to agree on a 'forgiving' EBNF that can be used for input of a PDN file and a 'strict' EBNF that has to be used for output of a PDN file. I suggest that the strict EBNF does not allow spaces in moves, and does only use the * as a game separator.

separator space: '\s+' ;

token Win:           '1-0'                                
token Draw:          '1/2-1/2'                            
token Loss:          '0-1'                                
token NoResult:      '\*'                                 
token NumericMove:   '\d+(\s?[-x]\s?\d+)+[*?!]?'          
token ChessMove:     '[a-h][1-8]([-x][a-h][1-8])+[*?!]?'  
token Identifier:    '[a-zA-Z]\w*'                        
token String:        '"[^"]*"'                            
token Comment:       '{[^}]*}'                            
token MoveNumber:    '\d+\.(\.\.)?'                       
token Nag:           '\$\d+'                              
                                                          
PDNFile       -> Game (GameSeparator Game)* GameSeparator?
GameSeparator -> Win | Draw | Loss | NoResult             
Game          -> (GameHeader GameBody?) | GameBody        
GameHeader    -> (Tag)+                                   
Tag           -> '\[' Identifier String '\]'                                                                                               
GameMove      -> MoveNumber? (NumericMove | ChessMove)    
GameBody      -> Annotation? (GameMove Annotation?)+      
Variation     -> '\(' GameBody '\)'                                               
Annotation    -> (Variation | Comment | Nag)+
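
For readers who want to experiment with the token definitions, the following is a minimal Python sketch of a lexer built from the regular expressions above. Several things are assumptions rather than part of the post: the lexer picks the longest match and breaks ties in favour of the token declared first, the braces of the Comment pattern are escaped for Python's `re`, and a `Punct` token is added for the literal brackets and parentheses that the grammar rules use directly. It only produces tokens; it does not implement the grammar rules.

```python
import re

# Token patterns copied from the grammar above, in declaration order.
TOKEN_SPECS = [
    ("Win", r"1-0"),
    ("Draw", r"1/2-1/2"),
    ("Loss", r"0-1"),
    ("NoResult", r"\*"),
    ("NumericMove", r"\d+(\s?[-x]\s?\d+)+[*?!]?"),
    ("ChessMove", r"[a-h][1-8]([-x][a-h][1-8])+[*?!]?"),
    ("Identifier", r"[a-zA-Z]\w*"),
    ("String", r'"[^"]*"'),
    ("Comment", r"\{[^}]*\}"),
    ("MoveNumber", r"\d+\.(\.\.)?"),
    ("Nag", r"\$\d+"),
    ("Punct", r"[\[\]()]"),  # assumption: literals used in Tag and Variation
]
TOKEN_RES = [(name, re.compile(pattern)) for name, pattern in TOKEN_SPECS]
SPACE_RE = re.compile(r"\s+")


def tokenize(text):
    """Longest-match lexer; ties go to the token declared first (assumed)."""
    tokens, pos = [], 0
    while pos < len(text):
        space = SPACE_RE.match(text, pos)
        if space:
            pos = space.end()
            continue
        best_name, best_end = None, pos
        for name, regex in TOKEN_RES:
            match = regex.match(text, pos)
            if match and match.end() > best_end:
                best_name, best_end = name, match.end()
        if best_name is None:
            raise ValueError(f"unexpected character at offset {pos}: {text[pos]!r}")
        tokens.append((best_name, text[pos:best_end]))
        pos = best_end
    return tokens
```

Under these assumptions `tokenize("1-0")` yields a Win token, while `tokenize("1-12")` and also `tokenize("1-1")` yield a NumericMove token, which is one way to see why a 1-1 result cannot simply be added as another separator.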

FeikeBoomstra · Post by **FeikeBoomstra** » Mon May 11, 2009 23:52

Can't it be:

GameBody -> Annotation? (GameMove Annotation?)*

Piet Bouma · Post by **Piet Bouma** » Tue May 12, 2009 00:05

Wieger Wesselink wrote:I have made an EBNF for a 'forgiving' parser, see below. This one accepts moves with spaces (one before and/or after the separator). It also accepts moves in chess notation. One more difference with respect to the previous version is that the Nag annotations are at the same level as variations and comments. Apparently that is the way they are used in TurboDambase. With this grammar I have tested a lot of pdn files that I found on the internet. All of them pass, except for the ones with real errors or with 0-2, 1-1 or 2-0 as game separators. The 1-1 result is what causes a real problem. It is a prefix of a move like 1-12, and therefore it requires a non-trivial parser to deal with that case. This does not seem acceptable to me, so I left those results out.

Perhaps we can try to agree on a 'forgiving' EBNF that can be used for input of a PDN file and a 'strict' EBNF that has to be used for output of a PDN file. I suggest that the strict EBNF does not allow spaces in moves, and does only use the * as a game separator.
separator space: '\s+' ;

token Win:           '1-0'                                
token Draw:          '1/2-1/2'                            
token Loss:          '0-1'                                
token NoResult:      '\*'                                 
token NumericMove:   '\d+(\s?[-x]\s?\d+)+[*?!]?'          
token ChessMove:     '[a-h][1-8]([-x][a-h][1-8])+[*?!]?'  
token Identifier:    '[a-zA-Z]\w*'                        
token String:        '"[^"]*"'                            
token Comment:       '{[^}]*}'                            
token MoveNumber:    '\d+\.(\.\.)?'                       
token Nag:           '\$\d+'                              
                                                          
PDNFile       -> Game (GameSeparator Game)* GameSeparator?
GameSeparator -> Win | Draw | Loss | NoResult             
Game          -> (GameHeader GameBody?) | GameBody        
GameHeader    -> (Tag)+                                   
Tag           -> '\[' Identifier String '\]'                                                                                               
GameMove      -> MoveNumber? (NumericMove | ChessMove)    
GameBody      -> Annotation? (GameMove Annotation?)+      
Variation     -> '\(' GameBody '\)'
Annotation    -> (Variation | Comment | Nag)+             

Wieger, did you also try the output of a .pdn file from Toernooibase (http://toernooibase.kndb.nl)?
For example: http://toernooibase.kndb.nl/applet/appl ... 3&r=2&jr=8

Here I use the "Delfts results", also as the result at the end of the .pdn file.
Does that mean I have to replace it with 1/2-1/2 (or better, "*")?
Entry Dambo accepts these results!
(Of course I can make the separator *, but why can't other results be 'forgiven'? If you make the [Event] tag the absolute separator, i.e. the start of a new game, you can quite easily filter out all the different results from the string before it.)
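
The idea of treating the [Event] tag as the start of a new game can be sketched in a few lines of Python. This is only an illustration under the assumption that every game begins with a line starting with `[Event`; it would not separate games that have no tags, and `split_games` is a hypothetical name.

```python
import re

# Assumption: a new game starts at every line that begins with an [Event tag.
EVENT_START = re.compile(r"^\[Event\b", re.MULTILINE)


def split_games(pdn_text):
    """Cut pdn_text into chunks, one per [Event tag; keep any leading chunk."""
    cuts = [m.start() for m in EVENT_START.finditer(pdn_text)]
    bounds = [0] + [c for c in cuts if c > 0] + [len(pdn_text)]
    chunks = (pdn_text[a:b].strip() for a, b in zip(bounds, bounds[1:]))
    return [chunk for chunk in chunks if chunk]


sample = (
    '[Event "A"]\n[Result "4-6"]\n1. 32-28 19-23 4-6\n\n'
    '[Event "B"]\n1. 33-28 *\n'
)
games = split_games(sample)
```

Each chunk then ends with whatever result string the writer used, so a reader could inspect the last token of a chunk without the tokenizer having to recognise it as a separator.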

Wieger Wesselink · Post by **Wieger Wesselink** » Tue May 12, 2009 00:10

Rein Halbersma wrote:Perhaps we can structure the discussion a bit more. IMO, it might be a good idea to first come to a broad consensus on the functional requirements that the new PDN standard needs to satisfy. Only then should we look for a clean grammar.

Hi Rein, thanks for your clarifications and for bringing these issues into the discussion. I am primarily interested in getting a formal definition of valid PDN. But it is good to take a broader view as well.

I think it is important to appreciate the universality of this goal: PDN is not just for programmers, but also for players to exchange games. Readability is therefore not just eye candy but an ingredient that will facilitate standard compliance.

Regarding this, FEN isn't so bad. It's quite easy for humans to use a FEN string to set up a position on the board.

Functional requirements
---------------------------

1) Encompass all known and future draughts variants.
[...]
2) Backwards read-compatibility for all existing PDN archives on the web.
[...]
3) Backwards write-compatibility to the most frequently used programs on the web.
[...]
4) Anticipate future online draughts games / fragments / problems.
[...]

These seem all important goals to me. Regarding the short notation: I know this is a very sensitive matter for problemists. However, as far as I know there is no consensus about what valid short notation is. Until that is settled I'd rather not have it in the standard.

Then a final comment on the syntax for FEN strings. This was first described in Adrian Millet's attempt at a PDN standard http://homepages.tcp.co.uk/~pcsol/sagehlp1.htm#PDN

FEN can also perfectly be specified using an EBNF, with a strict and a forgiving variant. Perhaps we should look into that after the EBNF for PDN games itself has been fixed.

Wieger Wesselink · Post by **Wieger Wesselink** » Tue May 12, 2009 00:21

Piet Bouma wrote: Wieger, did you also try the output of a .pdn file from Toernooibase (http://toernooibase.kndb.nl)?
For example: http://toernooibase.kndb.nl/applet/appl ... 3&r=2&jr=8

Here I use the "Delfts results", also as the result at the end of the .pdn file.
Does that mean I have to replace it with 1/2-1/2 (or better, "*")?
Entry Dambo accepts these results!
(Of course I can make the separator *, but why can't other results be 'forgiven'? If you make the [Event] tag the absolute separator, i.e. the start of a new game, you can quite easily filter out all the different results from the string before it.)

Until now I had not tried games from Toernooibase. I wasn't even aware of the possibility to download games in PDN format. Indeed, results like 4-6 cause problems. My proposal is therefore to only use * as a separator between games. For this I will have to change EntryDambo as well, as it allows arbitrary strings as results, and it even promotes using 2-0, 1-1 and 0-2.

Wieger Wesselink · Post by **Wieger Wesselink** » Tue May 12, 2009 00:25

FeikeBoomstra wrote:Can't it be:

GameBody -> Annotation? (GameMove Annotation?)*

An empty move sequence is still possible. This is handled by the Game rule:

Game -> (GameHeader GameBody?) | GameBody

I have done it like this to make sure a game cannot be empty.

FeikeBoomstra · Post by **FeikeBoomstra** » Tue May 12, 2009 00:37

Ok, I didn't study your EBNF carefully enough

Ed Gilbert · Post by **Ed Gilbert** » Tue May 12, 2009 02:32

My proposal is therefore to only use * as a separator between games.

I am confused by this. Are you proposing to use * as a game terminator only for games with an unknown result? In English checkers the asterisk at the end of the moves is commonly used to mean both a result code of unknown and a game terminator, so to use it as a terminator for games with results other than 'unknown' seems to signal conflicting results to legacy parsers. Also wasn't the asterisk a problem because of its other (non-pdn but still common) use as a strength indicator of a move, 'the only move to draw', etc.? I thought we were going to drop all use of these result codes as game terminators and rely on the header tags of the next game to serve that purpose.

-- Ed

Wieger Wesselink · Post by **Wieger Wesselink** » Tue May 12, 2009 08:54

Ed Gilbert wrote:
My proposal is therefore to only use * as a separator between games.
I am confused by this. Are you proposing to use * as a game terminator only for games with an unknown result? In English checkers the asterisk at the end of the moves is commonly used to mean both a result code of unknown and a game terminator, so to use it as a terminator for games with results other than 'unknown' seems to signal conflicting results to legacy parsers. Also wasn't the asterisk a problem because of its other (non-pdn but still common) use as a strength indicator of a move, 'the only move to draw', etc.? I thought we were going to drop all use of these result codes as game terminators and rely on the header tags of the next game to serve that purpose.

-- Ed

Sorry about the confusion that I caused. There are two use cases for which I now think that a game separator is relevant. The first one is when you want to store multiple positions without any moves in a PDN file. If there is no game separator, then all the FEN tags of the positions will become part of the same game. This seems undesirable to me. The second one is when you want to store multiple games without any tags in a PDN file. Also in that case a game separator is needed.

Contrary to what I thought in the beginning the asterisk is not causing problems as a separator. If we assume that there may be no space between moves and their strength indicators, then there can be no confusion. So '32-28* 19-23' is considered as one game, while '32-28 * 19-23' is considered as two games. Also the other results 1-0, 0-1 and 1/2-1/2 are not causing conflicts as long as a move is defined as a single token. There are no moves that have 1-0 or 0-1 as a prefix, assuming that we don't allow leading zeroes in numbers. The 1-1 result from international draughts does give problems, since it is a prefix of moves like 1-12 and 1-18.
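
The prefix argument above can be checked mechanically. The Python sketch below is an over-approximation made for illustration: it treats every ordered pair of distinct squares from 1 to 50 as a candidate move written with a hyphen and without leading zeroes, rather than generating only legal moves. `prefix_conflicts` is a hypothetical helper.

```python
def prefix_conflicts(results, squares=50):
    """Map each result string to the candidate moves it is a proper prefix of."""
    moves = [
        f"{a}-{b}"
        for a in range(1, squares + 1)
        for b in range(1, squares + 1)
        if a != b
    ]
    conflicts = {}
    for result in results:
        hits = [m for m in moves if m.startswith(result) and m != result]
        if hits:
            conflicts[result] = hits
    return conflicts


conflicts = prefix_conflicts(["1-0", "0-1", "1/2-1/2", "1-1"])
```

Under these assumptions only 1-1 shows up in `conflicts`, as a prefix of 1-10 through 1-19; the other three result strings are not a prefix of any candidate move.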

Another reason for keeping the * as a game separator is backwards compatibility. I think many programs only use the result tag for determining the result, so they will still be able to read newer PDN files (but this must be verified).

Of course if the general opinion is that game separators should be left out, I will remove them again from the definition.

Piet Bouma · Post by **Piet Bouma** » Tue May 12, 2009 09:44

Wieger Wesselink wrote: Also the other results 1-0, 0-1 and 1/2-1/2 are not causing conflicts as long as a move is defined as a single token. There are no moves that have 1-0 or 0-1 as a prefix, assuming that we don't allow leading zeroes in numbers.

Oops. Another conflict. In Toernooibase a leading zero is in common use, partly because of the damweb applet and damweb notation.
For a long time only damweb notation was stored in the database; it is now converted on demand to .pdn notation with leading zeros.
Can a leading zero not be 'forgiven'?
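
One way a forgiving reader might handle this is to strip leading zeros from a move token once it has been recognised as a move. The Python sketch below only illustrates that idea; whether the standard should allow it is the open question, and `strip_leading_zeros` is a hypothetical name.

```python
import re

# A run of zeros that starts a number (not preceded by a digit) and is
# followed by at least one more digit.
LEADING_ZEROS = re.compile(r"(?<!\d)0+(?=\d)")


def strip_leading_zeros(move):
    """Normalise a move token such as '01-07' to '1-7'."""
    return LEADING_ZEROS.sub("", move)
```

The result strings 1-0, 0-1 and 1/2-1/2 pass through this function unchanged, because their zeros are not followed by another digit.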

World Draughts Forum
