PGN Utils

 PGNutils.Txt -  8 SEP 2006 - by Tom McCormick -
 Many of the freeware utility programs described below have been developed 
 under the Windows XP "command window" which emulates MS DOS. This is 
 reached by clicking on Start, then on Run, then key in CMD.EXE (for XP) or
 COMMAND.COM (for Windows 9x or ME) and press the Enter key.  Key in  EXIT 
 to return to Windows.

 I expect that these programs would run with ANY version of Windows,
 but I have not tested them with all other versions.

 All of these programs expect an input file to use Carriage-Return
 and Linefeed character pairs as line delimiters. PGN from Unix or
 Linux systems should be processed first using the crlf.exe utility 
 to convert the single-character "newline" to CR/LF before processing.
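
 The conversion described above can be sketched in a few lines of Python.
 This is only an illustration of the idea, not the source of crlf.exe;
 the function name to_crlf is my own, and unlike crlf.exe it returns new
 data rather than changing a file in place.

```python
def to_crlf(data: bytes) -> bytes:
    """Return data with every line ending normalized to CR/LF."""
    # Collapse existing CR/LF pairs first so they are not doubled,
    # then expand every bare LF (Unix/Linux newline) to CR/LF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


# Mixed input: one Unix line, one Windows line, one Unix line.
print(to_crlf(b"1.e4 e5\n2.Nf3 Nc6\r\n3.Bb5 a6\n"))
```

 Reading and writing the file in binary mode ("rb" / "wb") matters here,
 since text mode may translate line endings on its own.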

 All of these utilities will accept ANY size input file. However,
 when running PGNTRIM5 or PGNTRIM6 to normalize fresh PGN files,
 individual PGN GAMES larger than 8,000 characters will be sent
 into the .BAD file for review, and will not be in the new output
 file. This is done to permit the user to review each rejected game
 and decide whether to manually edit the problem or discard that game.
 Any PGN game containing a [FEN tag will be passed through
 to the output file without any normalizing or correcting.

    Run PGNTRIM5 or PGNTRIM6 before running any of the other
    PGN utilities described at the end of this document.


 The comments about PGNTRIM5 apply to PGNTRIM6, a later version.
 PGNTRIM6 has only a few differences from PGNTRIM5...notably that
 semicolons are permitted within tags rather than treating them
 according to the PGN standard as signalling the beginning of a
 comment through the end of that line/record.

 PGNTRIM5.EXE is a freeware Windows utility to correct most PGN syntax 
 errors, and to direct games which need human review into a separate 
 output file named BADTRIM5.BAD. That file usually furnishes enough 
 information to the user so that a decision can be made to correct the 
 input file and run again, or to accept the number of games rejected 
 from the new output file. PGNTRIM5 never changes the original input file.

 PGNTRIM5.EXE requires no installation routine, nor any .DLL file(s).
 It runs from a Microsoft DOS prompt, or from a Microsoft Windows 95,
 98, ME, NT, NT2000, or XP command prompt, and probably under VISTA.

 PGNTRIM5 does NOT detect illegal/impossible moves; but PGNSCID.EXE is 
 freeware, and will catch most of these.

 I run pgntrim5 first against newly downloaded PGN files in order to 
 clean up common syntax problems and omissions, and to drop text info
 such as titles and crosstables. Then I run the output file into pgnscid
 to catch any illegal moves.  This approach greatly reduces the amount of
 time needed to edit the PGN file for syntax errors before placing it 
 into a database.

 This reduces the amount of manual review and editing necessary to 
 rescue important games, or discard others from a file to end up 
 with a much cleaner PGN file for viewing, or for insertion into
 a database such as SCID, CHESSBASE, FRITZ, or BOOKUP Lite.

 PGNTRIM5 will "repair" correctable PGN syntax errors such as  cd4  
 which is changed to  cxd4 and e8Q  to  e8=Q     and f1 Q +    which
 becomes f1=Q+ etc. You may run the accompanying TEST.PGN file to see
 what PGNTRIMn will do, i.e.,  PGNTRIM6 TEST.PGN TESTOUT.PGN.... if you
 do not enter the filenames in the command tail, you will simply be
 prompted for them as the program begins.
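
 The three repairs named above can be sketched with regular expressions.
 This is a minimal Python illustration under my own assumptions, not the
 PGNTRIM5 logic: the function name repair and both patterns are
 hypothetical, and only the cases cd4, e8Q and f1 Q + are covered.

```python
import re

# A pawn capture written without "x": file letter, then a destination square.
PAWN_CAPTURE = re.compile(r"\b([a-h])([a-h][1-8])\b")
# A promotion written without "=", possibly with stray spaces before the
# piece letter and before a trailing + or #.
PROMOTION = re.compile(
    r"\b([a-h](?:x[a-h])?[18])\s*=?\s*([QRBN])(?![a-hx1-8])(?:\s*([+#]))?"
)


def repair(movetext):
    def fix_capture(match):
        source_file, destination = match.group(1), match.group(2)
        # A pawn can only capture onto an adjacent file.
        if abs(ord(source_file) - ord(destination[0])) == 1:
            return source_file + "x" + destination
        return match.group(0)

    text = PAWN_CAPTURE.sub(fix_capture, movetext)
    return PROMOTION.sub(
        lambda m: f"{m.group(1)}={m.group(2)}{m.group(3) or ''}", text
    )


for sample in ("cd4", "e8Q", "f1 Q +"):
    print(repr(sample), "->", repr(repair(sample)))
```

 The lookahead after the piece letter keeps a sequence such as  a8 Rb1
 from being mistaken for a promotion.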

 (P)ortable (G)ame (N)otation format is rather thoroughly defined and 
 effective as a means to record and distribute recordings of chess game
 moves.  This standard is available over the internet from several sources. 

 Recently, people have been submitting "annofritzed" PGN games to internet
 websites. These often reach more than 8,000 characters of movestext...and
 all too often contain unbalanced alternate move tokens (..)..) for which
 nesting IS permitted, or they may contain unbalanced curly brace tokens 
 {..}  delimiting comments.  Fritz will produce correct PGN syntax when 
 autofritzing, but humans seem driven to "improve or clarify" these comments
 and they frequently end up with these tokens unbalanced. The PGN standard
 forbids nested {..{...}..} curly braces, anyway.

 A common error made by players trying to enhance comments or alternate
 moves, is to use a semicolon ";". The PGN standard requires that ALL
 text following it in that input record be dropped as comments. If that
 occurs within {..} or within (...) then the closing character is
 dropped causing unbalance. Ahem...a STANDARD is a STANDARD, thank you.
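
 The balance rules described above (parentheses may nest, curly braces
 may not, and a semicolon drops the rest of its line) can be sketched in
 Python as follows. The function name check_balance and its return
 strings are my own; this follows the rules as this document states
 them. Some PGN readers instead ignore a semicolon that falls inside a
 brace comment; this sketch does not.

```python
def check_balance(movetext):
    """Return "ok" or a short description of the first problem found."""
    depth = 0          # current nesting depth of ( ... )
    in_brace = False   # True while inside a { ... } comment
    for line in movetext.splitlines():
        # Everything from a semicolon to the end of the line is dropped.
        line = line.split(";", 1)[0]
        for ch in line:
            if in_brace:
                if ch == "{":
                    return "nested brace"
                if ch == "}":
                    in_brace = False
            elif ch == "{":
                in_brace = True
            elif ch == "}":
                return "unbalanced brace"
            elif ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth < 0:
                    return "unbalanced parenthesis"
    if in_brace:
        return "unbalanced brace"
    if depth:
        return "unbalanced parenthesis"
    return "ok"


print(check_balance("1.e4 {good ; move}\ne5"))
```

 In the sample call the semicolon swallows the closing brace, so the
 comment never ends and the game would need human review.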

 Recently, PGN games have been appearing on the internet which are 
 Fischer-Random games. If there is no FEN statement, or other indication 
 of this, then "illegal moves" such as 1.Nb3 will pass through syntax 
 checking, but will appear to be an illegal move to database programs!
 Some standard indication of Fischer-Random games is being debated, and 
 needs to be added to the PGN standard. Until that standard is final,
 PGNTRIM5 or 6 will not recognize a Fischer-Random tag such as
 [Variant "Fischerandom"].

 Chess magazines and books are not immune from typographical errors and
 omissions such as leaving out moves entirely, leaving pieces off the
 diagrams, having two black Kings, no White King, displaying entirely
 the wrong diagram, etc.

 Persons collecting PGN chess game records do not want to end up with such
 problems that show up while a game is being studied!  Normalization 
 programs can detect most PGN problems, fix many, and tell the user about 
 the others so that they can be manually edited, or the game discarded. 

 PGNTRIM5 directs erroneous games into PGNTRIM5.BAD where they can be
 reviewed and edited separately from the clean output file. If you were
 to edit and delete some games from PGNTRIM5.BAD...leaving only games
 which you have `fixed`, then you could simply COPY the cleaned
 PGNTRIM5.BAD file to the clean output file as for example:
 copy newfile.pgn+pgntrim5.bad     (NOTE no spaces in the file list!
 and you may prefer to drop the [Warning tag from the corrected game).

 A PGN game recording example follows. Heading records are called "tags",
 and seven of them are required as a minimum....the first 7 shown below 
 are required in any PGN game. Other tags are optional such as the 
 "Opening" and "ECO" tags shown.  All tags must conform to standard in 
 order to be useful to a wide audience...Each tag must begin with [ and
 end with ], and the tag name must begin with one uppercase letter, the 
 text must be enclosed within quotation marks, etc.  It is somewhat 
 surprising just how many PGN games have simple syntax errors in the 
 tag records!
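
 The tag rules listed above can be checked mechanically. Here is a
 minimal Python sketch, assuming one tag per line; the names TAG,
 REQUIRED and check_tags are my own and this is not how PGNTRIM5 itself
 is written.

```python
import re

# [Name "value"] where Name begins with one uppercase letter.
TAG = re.compile(r'^\[([A-Z][A-Za-z0-9_]*)\s+"(.*)"\]$')
REQUIRED = ("Event", "Site", "Date", "Round", "White", "Black", "Result")


def check_tags(lines):
    """Return a list of problems found in the tag lines of one game."""
    problems = []
    seen = set()
    for line in lines:
        match = TAG.match(line.strip())
        if match:
            seen.add(match.group(1))
        else:
            problems.append("bad tag syntax: " + line.strip())
    for name in REQUIRED:
        if name not in seen:
            problems.append("missing required tag: " + name)
    return problems


print(check_tags(['[Event "Example"]', '[site "Moscow"]']))
```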

 NOTE: PGNTRIM5 can be forced to retain all tags into the output file by
       adding /alltags to the command tail, otherwise only the following
       tags are preserved:
          [Event  [Site  [Date  [Round  [White  [Black [Result
          [ECO    [Opening   [WhiteElo  [BlackElo and  [Comment
       Stripping the [Annotator, [PlyCount, [Clock, etc. etc. saves
       considerable file space, but if you MUST have them all then
       always include /alltags in the command line for example
       pgntrim5 2006WCC.PGN 2006WCC.TRM /alltags
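
       The effect of /alltags can be pictured with this small Python
       sketch. The tag list is the one given above; the function name
       filter_tags and its all_tags argument are my own illustration.

```python
DEFAULT_TAGS = {
    "Event", "Site", "Date", "Round", "White", "Black", "Result",
    "ECO", "Opening", "WhiteElo", "BlackElo", "Comment",
}


def filter_tags(tag_lines, all_tags=False):
    """Keep every tag when all_tags is set, else only the default list."""
    if all_tags:
        return list(tag_lines)
    kept = []
    for line in tag_lines:
        # The tag name is the text between "[" and the first space.
        name = line.lstrip("[").split(" ", 1)[0]
        if name in DEFAULT_TAGS:
            kept.append(line)
    return kept


print(filter_tags(['[Event "New Orleans"]', '[PlyCount "57"]']))
```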

       By default, there will be exactly four complete moves per line in
       the output file unless you specify between one and seven moves.
       One move per line is useful in teaching situations where you want
       the students to comment on each move (in writing). Four moves per
       line permits printing with a decent sized font without overflows.
       These are specified in the command tail with /MPL:n, for example:
       pgntrim5 05Linar.pgn 05Linar2.pgn /MPL:5

 Normalization programs detect deviations from standard, and either fix 
 the problem, notify the user, or both. Missing tags, illegal moves or
 incomplete moves such as B7, a8,  or  Rx can not be fixed and are 
 simply reported to the user for editing or discarding the game.

 Other problems such as spacing errors can usually be fixed by a 
 normalization program, so  Nxg4Nbd7 (no space between White and Black
 halfmoves) can be fixed to Nxg4 Nbd7,  and  O-O5. can be fixed to
 O-O 5.  Castling must use the letter O, not zeroes; a normalization
 program can easily make that substitution, as PGNTRIMn does.
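
 A rough Python sketch of these spacing and castling fixes follows. The
 patterns are my own simplification, limited to the examples in the
 paragraph above, and are not taken from PGNTRIM5.

```python
import re

# A destination square immediately followed by the start of a piece move,
# as in Nxg4Nbd7.
FUSED_MOVES = re.compile(r"([a-h][1-8][+#]?)(?=[KQRBN][a-hx1-8])")
# Castling immediately followed by the next move number, as in O-O5.
CASTLE_THEN_NUMBER = re.compile(r"(O-O(?:-O)?[+#]?)(?=\d+\.)")


def normalize_spacing(movetext):
    # Zeroes to the letter O. Assumption: "0-0" never occurs in a result
    # code (1-0, 0-1, 1/2-1/2), so a plain text replacement is safe.
    text = movetext.replace("0-0-0", "O-O-O").replace("0-0", "O-O")
    text = FUSED_MOVES.sub(r"\1 ", text)
    return CASTLE_THEN_NUMBER.sub(r"\1 ", text)


print(normalize_spacing("9.Nxg4Nbd7 10.0-011.Qe2"))
```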

 Missing half-moves or entire moves can be detected and reported, as
 can a result code which does not match the [Result tag.
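
 The result check just mentioned can be sketched as follows, assuming
 the whole game (tags plus moves) is held in one string. The function
 name result_mismatch is hypothetical.

```python
import re

RESULTS = ("1-0", "0-1", "1/2-1/2", "*")


def result_mismatch(game_text):
    """True when the final result code is missing or differs from the tag."""
    tag = re.search(r'\[Result\s+"([^"]*)"\]', game_text)
    tokens = game_text.split()
    final = tokens[-1] if tokens else ""
    if tag is None or final not in RESULTS:
        return True
    return tag.group(1) != final


print(result_mismatch('[Result "1-0"]\n1.e4 e5 2.Qh5 Ke7 3.Qxe5# 0-1'))
```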

[Event "Example PGN Chess Game Record"]
[Site "Moscow"]
[Date "2003.12.25"]
[Round "2"]
[White "Blaganov"]
[Black "Dufus"]
[Result "1-0"]
[Opening "Scandinavian"]
[ECO "B01"]
1.e4 d5 2.exd5 Qxd5 3.Nc3 Qd8 4.d4 Nf6
 {B01 Scandinavian}
5.Bc4 c6 6.Nf3 Bg4 7.Bxf7 Kxf7 8.Ne5 Kg8
9.Nxg4 Nbd7 10.Qe2 Nxg4 11.Qe6#   0-1 


Here is an example "BEFORE" and "AFTER" using PGNTRIM5.
This very old game was annotated by the computer program
Fritz 6...a process called annofritzing. There are many
comments within curly braces {...}, NAG comments... $17,
move continuations following an alternate move sequence,
and there are many nested alternate moves i.e., ( ( ( ) ) )

[Event "New Orleans"]
[Site "New Orleans"]
[Date "1849.??.??"]
[Round "?"]
[White "Morphy, Paul "]
[Black "J. MacConnell sr"]
[Result "1-0"]
[Annotator "Fritz 6 (6s)"]
[PlyCount "57"]
[EventDate "1849.??.??"]

1. e4 {C39: King's Gambit Accepted: 3 Nf3 g5 4 h4} e5 2. f4 exf4 3. Nf3 g5 4.
h4 g4 5. Ne5 h5 6. Bc4 Rh7 7. d4 d6 8. Nd3 f3 9. g3 (9. gxf3 Be7 10. Be3 Bxh4+
11. Kd2 Bg5 12. f4 Bf6 13. a3 c6 14. Nc3 Bh8 15. f5 Ne7 16. Qe2 Kf8 17. f6 Bxf6
18. Raf1 d5 19. Rxf6 dxc4 20. Ne5 Nd7 21. Nxd7+ Bxd7 22. Rh6 Rg7 23. R6xh5 Ng8
{Pektor,A-Zvara,P/Prague 1992/0-1 (48)}) 9... Nc6 10. Nf4 $146 (10. c3 Nge7 (
10... Nce7 11. Kf2 c6 12. Nf4 Qc7 13. Qb3 b5 14. Bd3 Rh8 15. Re1 Ng6 16. Nxg6
fxg6 17. e5 Ne7 18. Bxg6+ Kd8 19. Qf7 Nxg6 20. Qxg6 Qg7 21. Bg5+ Kc7 22. exd6+
Kb6 23. Bd8+ Ka6 24. Qxg7 Bxg7 25. Bc7 {
Abbe de Lionne & Morant-Maubisson & Auzout/Paris 1680/1-0 (40)}) 11. Nf4 a6 12.
a4 Bg7 13. Qb3 Bh8 14. Nxh5 Kf8 15. Nf4 Na5 16. Qa2 Nxc4 17. Qxc4 c6 18. Nd2 d5
19. exd5 cxd5 20. Qb4 Bf6 21. Nf1 Kg7 22. h5 Nc6 23. Qc5 Be6 24. Qa3 Qd7 {
Jannisson & Maubisson-Lionne & Morant/Paris 1680     (36)}) (10. Bb5 d5 11. Ne5
Bd7 12. Nxd7 Qxd7 $17 (12... Kxd7 $2 13. exd5 Bd6 14. Kf2 $18 (14. dxc6+ $6
bxc6 15. Ba4 Bxg3+ 16. Kf1 Rb8 $16 (16... Bxh4 $4 {
taking the pawn will bring Black grief} 17. Qd3 $18)))) 10... Bd7 (10... Nf6
11. Nc3 $17) 11. Nc3 Nf6 (11... Bg7 12. Be3 $17) 12. Be3 Ne7 (12... Bh6 13. Rf1
$17) 13. Kf2 c6 (13... Bh6 14. e5 dxe5 15. dxe5 $17) 14. Re1 Bg7 15. e5 dxe5
16. dxe5 Nfd5 (16... Nfg8 17. Ne4 Bxe5 18. Ng5 Bxf4 19. Nxh7 Bxe3+ 20. Rxe3 $11
) 17. Bxd5 (17. Nfxd5 Nxd5 18. Nxd5 cxd5 19. Qxd5 Bh8 $14) 17... cxd5 (17...
Nxd5 18. Ncxd5 cxd5 19. Nxd5 Be6 $14 (19... Bxe5 {
Black again will not be able to digest the pawn} 20. Bg5 f6 21. Nxf6+ Kf7 22.
Rxe5 (22. Qxd7+ $6 {is not possible} Qxd7 23. Nxd7 Bd4+ 24. Kf1 Kg6 $18) 22...
Qb6+ 23. Re3 $18)) 18. Bc5 (18. Ncxd5 Nxd5 (18... Bxe5 $2 {
is nothing because of} 19. Bb6 Qb8 20. Bd4 $18) 19. Nxd5 Be6 $14 (19... Bxe5 {
as before the pawn must remain untouched} 20. Bg5 f6 21. Nxf6+ Kf7 22. Rxe5
Qb6+ 23. Re3 $18)) 18... Bc6 (18... Rc8 19. Bxa7 Qa5 20. Bd4 $15) 19. b4 (19.
Qd3 Rh6 $11) 19... b6 (19... d4 $142 20. Qd3 Rh6 $15 (20... dxc3 21. Qxh7 Kf8
22. Rad1 $18 (22. Qxh5 $6 {is the less attractive alternative} Qd2+ 23. Kg1 Kg8
$18) (22. Nxh5 $4 {the pawn is indigestible} Qd2+ 23. Re2 Qxe2+ 24. Kg1 Qg2#)))
20. Bxe7 $14 Qxe7 {The isolani on e5 becomes a target} 21. Nfxd5 Qb7 $4 (21...
Bxd5 $142 {is just about the only chance} 22. Nxd5 Qd8 23. Nf6+ Bxf6 24. exf6+
Kf8 25. Qxd8+ Rxd8 $16) 22. Nf6+ $18 Bxf6 23. exf6+ Kf8 24. Qd6+ Kg8 25. Re7
Qc8 26. Rc7 Qf5 27. Qxc6 {Threatening mate: Qxa8} Qxc2+ (27... Rf8 {
does not save the day} 28. Nd5 Qe5 $18) 28. Ke3 Rd8 (28... Rf8 29. Rxa7 Qb2 30.
Ra8 Qxc3+ 31. Qxc3 Rxa8 32. Qc7 $18) 29. Rd1 $1 {
the end of the story. Threatening mate... how?} (29. Rd1 Rf8 30. Rxa7 $18) 1-0

...after processing the above file through PGNTRIMn, it appears as

[Event "New Orleans"]
[Site "New Orleans"]
[Date "1849.??.??"]
[Round "?"]
[White "Morphy, Paul "]
[Black "J. MacConnell sr"]
[Result "1-0"]
[Annotator "Fritz 6  6s "]
[PlyCount "57"]
[EventDate "1849.??.??"]

1.e4 e5 2.f4 exf4 3.Nf3 g5 4.h4 g4
5.Ne5 h5 6.Bc4 Rh7 7.d4 d6 8.Nd3 f3
9.g3 Nc6 10.Nf4 Bd7 11.Nc3 Nf6 12.Be3 Ne7
13.Kf2 c6 14.Re1 Bg7 15.e5 dxe5 16.dxe5 Nfd5
17.Bxd5 cxd5 18.Bc5 Bc6 19.b4 b6 20.Bxe7 Qxe7
21.Nfxd5 Qb7 22.Nf6+ Bxf6 23.exf6+ Kf8 24.Qd6+ Kg8
25.Re7 Qc8 26.Rc7 Qf5 27.Qxc6 Qxc2+ 28.Ke3 Rd8
29.Rd1 1-0


Here is an example "BEFORE" and "AFTER" using PGNTRIM5.
This game was annotated by the computer program Fritz8.
There are many {[%emt 0:00:00]} elapsed-time remarks
which unfortunately use square braces within the movestext!!
Although these are also within curly brace pairs, using
[..] square braces within the moves text area is a violation
of common PGN good practice, if not the standard, itself.
PGNTRIM5 will remove these as shown in the example below.

[Event "Fritz8 commentary removal test file"]
[Site "Howie in the Hills, Florida"]
[Date "2004.05.28"]
[Round "?"]
[White "Fritz 8"]
[Black "McGillicuddy, Sean"]
[Result "1-0"]
[ECO "B06"]
[PlyCount "75"]
[Comment "Unfortunately, Fritz 8 also uses funky comment spacing"

{286MB, Fritz8.ctg, Intel 2.5 WinXP
} 1. Nf3 {[%emt 0:00:00]} g6 {
[%emt 0:00:00]} 2. e4 {[%emt 0:00:00]} Bg7 {[%emt 0:00:03]} 3. d4 {
[%emt 0:00:00]} d6 {[%emt 0:00:04]} 4. Nc3 {[%emt 0:00:00]} Nc6 {[%emt 0:00:12]
} 5. Bb5 {[%emt 0:00:01]} Bd7 {[%emt 0:00:02]} 6. O-O {[%emt 0:00:02]} a6 {
[%emt 0:00:05]} 7. Be2 {[%emt 0:00:01]} Bg4 {[%emt 0:00:17]} 8. Be3 {
[%emt 0:00:01]} Nf6 {[%emt 0:00:10]} 9. h3 {[%emt 0:00:02]} Bd7 {[%emt 0:00:04]
} 10. Qc1 {[%emt 0:00:01]} O-O {[%emt 0:00:25]} 11. Qb1 {[%emt 0:00:02]} e5 {
[%emt 0:00:23]} 12. dxe5 {[%emt 0:00:02]} dxe5 {[%emt 0:00:14]} 13. Kh1 {
[%emt 0:00:01]} Re8 {[%emt 0:00:14]} 14. a3 {[%emt 0:00:01]} b5 {[%emt 0:00:20]
} 15. Bc5 {[%emt 0:00:02]} Be6 {[%emt 0:00:12]} 16. Qc1 {[%emt 0:00:02]} Qc8 {
[%emt 0:00:10]} 17. Qd2 {[%emt 0:00:02]} Bxh3 {[%emt 0:00:19]} 18. gxh3 {
[%emt 0:00:05]} Qxh3+ {[%emt 0:00:02]} 19. Nh2 {[%emt 0:00:00]} Nd4 {
[%emt 0:00:14]} 20. Rfd1 {[%emt 0:00:04]} Rad8 {[%emt 0:00:09]} 21. Qd3 {
[%emt 0:00:03]} Qc8 {[%emt 0:00:37]} 22. b4 {[%emt 0:00:03]} h5 {[%emt 0:00:10]
} 23. Rac1 {[%emt 0:00:03]} Bh6 {[%emt 0:00:07]} 24. Rb1 {[%emt 0:00:04]} Bf4 {
[%emt 0:00:14]} 25. Bf1 {[%emt 0:00:02]} Kg7 {[%emt 0:00:10]} 26. a4 {
[%emt 0:00:05]} c6 {[%emt 0:00:06]} 27. Bg2 {[%emt 0:00:02]} Rh8 {
[%emt 0:00:31]} 28. Nf3 {[%emt 0:00:07]} h4 {[%emt 0:00:08]} 29. Ne2 {
[%emt 0:00:05]} h3 {[%emt 0:00:04]} 30. Nfxd4 {[%emt 0:00:02]} exd4 {
[%emt 0:00:09]} 31. Bf3 {[%emt 0:00:04]} Ng4 {[%emt 0:00:07]} 32. Bxg4 {
[%emt 0:00:02]} Qxg4 {[%emt 0:00:09]} 33. Rg1 {[%emt 0:00:04]} Qh4 {
[%emt 0:00:38]} 34. Bxd4+ {[%emt 0:00:09]} Kg8 {[%emt 0:00:19]} 35. Rbf1 {
[%emt 0:00:06]} Rh6 {[%emt 0:00:24]} 36. Ng3 {[%emt 0:00:03]} h2 {
[%emt 0:00:56]} 37. Rg2 {[%emt 0:00:02]} Be5 {[%emt 0:00:17]} 38. Nf5 {
[%emt 0:00:02]} 1-0

[Event "Fritz8 commentary removal test file"]
[Site "Howie in the Hills, Florida"]
[Date "2004.05.28"]
[Round "?"]
[White "Fritz 8"]
[Black "McGillicuddy, Sean"]
[Result "1-0"]
[ECO "B06"]
[PlyCount "75"]
[Comment "Unfortunately, Fritz 8 also uses funky comment spacing"

1.Nf3 g6 2.e4 Bg7 3.d4 d6 4.Nc3 Nc6
5.Bb5 Bd7 6.O-O a6 7.Be2 Bg4 8.Be3 Nf6
9.h3 Bd7 10.Qc1 O-O 11.Qb1 e5 12.dxe5 dxe5
13.Kh1 Re8 14.a3 b5 15.Bc5 Be6 16.Qc1 Qc8
17.Qd2 Bxh3 18.gxh3 Qxh3+ 19.Nh2 Nd4 20.Rfd1 Rad8
21.Qd3 Qc8 22.b4 h5 23.Rac1 Bh6 24.Rb1 Bf4
25.Bf1 Kg7 26.a4 c6 27.Bg2 Rh8 28.Nf3 h4
29.Ne2 h3 30.Nfxd4 exd4 31.Bf3 Ng4 32.Bxg4 Qxg4
33.Rg1 Qh4 34.Bxd4+ Kg8 35.Rbf1 Rh6 36.Ng3 h2
37.Rg2 Be5 38.Nf5 1-0


 PGN2ONE  reads normalized PGN and creates one-record-per-game
 and prepends a 40 character sort key which can be used to sort
 by White, Black, ECO, Number of moves in game, Year of game, etc.
 Some batch files have been included to use qsort
 and perform each of the above sorts. 

 I refer to this output format as .111 format indicating one 
 line/record per game. The prepended sort/selection record area
 provides exactly consistent locations for important data needed
 to sort and select games. This prepended area is removed when the
 .111 format is converted back to PGN by either ONE2PGN or PGNUNDUP.

 The first 7 letters of player names is optimum because it reduces
 misspellings. Before you challenge this approach, look up the
 word  OPTIMUM. It would also be rather awkward to obtain fixed
 positions for full names such as Leko, Nimzowitsch, etc.

 Here are some examples of one-record-per-game created by PGN2ONE. You
 can see how easy it is to select or sort by critical elements.

1   5   10   15   20   25   30   35   40 ...see BYYEAR.BAT etc. examples.

  White   Black   Year Mvs Re Site ECO 
{ Adams   Kasparo 1992 022 0-1 Dor D31} [Event "?"][Site "Dortmund"]...
{ Anand   Kasparo 1998 024 1/2 Lin B55} [Event "It "][Site "Linares "]...
{ Bareev  Kasparo 1999 021 1/2 Sar D80} [Event "It "][Site "Sarajevo "]...
{ Beliavs Kasparo 1979 035 1-0 Min A61} [Event "?"][Site "Minsk"]...
{ Karpov  Kasparo 1996 045 1/2 Las D20} [Event "It "][Site "Las Palmas "]...
{ Kasparo Anand   1999 033 1-0 Wij A45} [Event "Blitz "][Site "Wijk aan Zee "]
{ Kasparo Huebner 1992 048 0-1 Col C23} [Event "?"][Site "Cologne"]...
{ Kasparo Ivanchu 1999 036 1-0 Lin D11} [Event "It "][Site "Linares "]...
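
 The 40-character prefix seen in the records above can be reproduced
 with a short Python sketch. The field widths are inferred from the
 sample records only; the function name sort_key and its arguments are
 my own, and PGN2ONE may build the key differently.

```python
def sort_key(white, black, date, moves, result, site, eco):
    """Build a 40-character prefix laid out like the sample records."""
    def field(text, width):
        # Truncate or pad so that every field sits in a fixed position.
        return text[:width].ljust(width)

    def surname(player):
        # "Kasparov, Garry" -> "Kasparov"
        return player.split(",")[0].strip()

    parts = [
        field(surname(white), 7),
        field(surname(black), 7),
        field(date, 4),               # year from "1992.??.??"
        f"{min(moves, 999):03d}",     # zero-padded number of moves
        field(result, 3),             # "1/2-1/2" becomes "1/2"
        field(site, 3),
        field(eco, 3),
    ]
    return "{ " + " ".join(parts) + "} "


key = sort_key("Adams, Michael", "Kasparov, Garry", "1992.??.??",
               22, "0-1", "Dortmund", "D31")
print(repr(key), len(key))
```

 Zero-padding the move count is what lets a plain text sort order the
 games correctly by length.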

 Creating one record per game in this way also facilitates the use
 of the FIND command to select or reject games containing certain
 text strings.  FIND comes with all versions of DOS or Windows.

 For example:

 find "O-O-O" MyBig.111 >CastLong.111
              {the > redirects output to a new file instead of the display.}
 find /V "O-O" MyBig.111 >NoCastl.111
              {selects only games in which neither player castles.}
              {the "/V" command parameter OMITS matched records.}
 find  "C02" MyBig.111 >FrAdvan.111
              {this outputs games of French Defense, Advance var.}
 find  "2004" MyBig.111 >2004Only.111
              {this outputs only games played during year 2004.}
 find /V "1/2" MyBig.111 >NoDraws.111
              {drops drawn games of any length.}

 Two or more `find` executions can be used to refine selections further:

 find  "Leko" MyBig.111 >Leko2.111
              {this outputs games played by Leko as White or Black.}

 find  "Kramnik" Leko2.111 >LekoKram.111
              {this outputs games played between Leko & Kramnik.}


 find "0-1" MyBig.111 >BlackWin.111
              {keeps Black wins; drops draws and incomplete games.}

 find "1-0" MyBig.111 >WhiteWin.111
              {keeps White wins; drops draws and incomplete games.}

 copy BlackWin.111+WhiteWin.111 WinsOnly.111

 find " 26." /V WinsOnly.111 >Miniat25.111
              {outputs games shorter than 26 moves.}
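
 For readers without the FIND command, the same selections can be
 sketched in Python. The function find_records is my own illustration;
 its invert argument plays the role of /V, and key_only (an extra that
 FIND does not offer) restricts the search to the 40-character prefix so
 that text inside comments or tags cannot cause a false match.

```python
def find_records(lines, text, invert=False, key_only=False):
    """Yield the .111 records that contain text (or lack it, if invert)."""
    for line in lines:
        haystack = line[:40] if key_only else line
        if (text in haystack) != invert:
            yield line


records = [
    '{ Adams   Kasparo 1992 022 0-1 Dor D31} [Event "?"] 1.d4 d5 0-1',
    '{ Anand   Kasparo 1998 024 1/2 Lin B55} [Event "It "] 1.e4 c5 5.O-O 1/2-1/2',
]
for record in find_records(records, "O-O", invert=True):
    print(record)
```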

 ONE2PGN reads the one-record-per-game file (any sequence) created 
 using PGN2ONE, and outputs a new PGN file. If the one-record-per-game
 file has been sorted on positions 1 to 40 with the intention of
 dropping duplicate games, then PGNUNDUP should be used instead of 
 ONE2PGN in order to drop duplicate games. ONE2PGN will NEVER drop
 a game, even if it is a duplicate game...and therefore the input
 file sequence to ONE2PGN is of no importance. You MAY or MAY NOT
 sort it into any sequence as you wish.


 PGNUNDUP reads the sorted output from PGN2ONE, and creates normal
 PGN from the incoming 1-record-per-game. The input file is expected
 to be in ascending sequence on positions 1 to 40 so that duplicate
 games can be detected and dropped. The input file MUST be in that
 sequence to use this program, else it will tell you that the input
 file is "out of sequence".
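
 The idea behind dropping duplicates from sorted input can be sketched
 in Python. I am assuming here that two records count as duplicates when
 their first 40 positions are identical; this document does not spell
 out the exact comparison PGNUNDUP makes, and the function name
 drop_duplicates is my own.

```python
def drop_duplicates(records):
    """Yield records from sorted input, skipping repeated 40-char keys."""
    previous = None
    for record in records:
        key = record[:40]
        if previous is not None and key < previous:
            # Sorted input is what makes one pass sufficient.
            raise ValueError("input file is out of sequence")
        if key != previous:
            yield record
        previous = key


sample = [
    "{ Adams   Kasparo 1992 022 0-1 Dor D31} game text",
    "{ Adams   Kasparo 1992 022 0-1 Dor D31} same game again",
    "{ Anand   Kasparo 1998 024 1/2 Lin B55} game text",
]
print(len(list(drop_duplicates(sample))))
```

 Because the input is in ascending sequence, each record only needs to
 be compared with the one before it, so the file size does not matter.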

 PGNbest6  reads normal PGN, looks in a plain text table (provided)
 for the 3,000 or so greatest player names of all time, and outputs
 games if EITHER player is on that list.  Kasparov vs Amateur will be
 written to the new output file, but NoName vs. Amateur will not.

 The PGNbest6.RAT plain text ratings file (user modifiable) should
 be in the same folder as the .exe file. This file may be in ANY
 sequence, but I find alphabetical by name easier to update!
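
 The selection rule can be sketched in Python as below. The layout of
 the .RAT file is not described here, so the sketch simply takes a
 collection of names; name_key, load_names and is_selected are
 hypothetical helper names, and matching on the first seven letters of
 the surname follows the convention used elsewhere in these utilities.

```python
def name_key(player):
    """First seven characters of the surname, lowercased for matching."""
    return player.split(",")[0].strip()[:7].lower()


def load_names(names):
    # Order does not matter, so a set is enough.
    return {name_key(name) for name in names}


def is_selected(white, black, strong_players):
    """True if EITHER player appears in the list of strong players."""
    return name_key(white) in strong_players or name_key(black) in strong_players


strong = load_names(["Kasparov, Garry", "Leko, Peter"])
print(is_selected("Kasparov, Garry", "Amateur", strong))
print(is_selected("NoName", "Amateur", strong))
```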


 Always run PGNTRIM5 first to normalize the PGN syntax.

   Example:  pgntrim5 05Linar.pgn 05Linar2.pgn 

   Example:  pgntrim5 05Linar.pgn 05Linar2.pgn /MPL:5
             NOTE: /MPL:n where n is 1 to 7 sets moves per line in output

 By using PGN2ONE.exe, you create one line (record) per game, and 
 prepend sort fields (columns) to it. You make sorting or selecting 
 much simpler since several key sort items are in fixed positions!
 Utility programs are provided to convert PGN to one-line-per-game
 and back again after sorting or selecting has been done.  For example,
 here are a few records illustrating this format:

  White   Black   Year Mvs Re Site ECO

{ Kramnik Kasparo 2003 018 1/2 Lin D11} [Event "XX SuperGM"][Site "Lin...
{ Radjabo Leko    2003 046 0-1 Lin E12} [Event "XX SuperGM"][Site "Lin...
{ Anand   Ponomar 2003 064 1-0 Lin C65} [Event "XX SuperGM"][Site "Lin...
{ Vallejo Anand   2003 030 1/2 Lin A30} [Event "XX SuperGM"][Site "Lin...
{ Kasparo Radjabo 2003 039 0-1 Lin C11} [Event "XX SuperGM"][Site "Lin...
{ Ponomar Kramnik 2003 040 0-1 Lin B30} [Event "XX SuperGM"][Site "Lin...
{ Radjabo Ponomar 2003 011 1/2 Lin D30} [Event "XX SuperGM"][Site "Lin...
{ Kramnik Vallejo 2003 030 1/2 Lin D15} [Event "XX SuperGM"][Site "Lin...
{ Leko    Kasparo 2003 087 1/2 Lin B55} [Event "XX SuperGM"][Site "Lin...
{ Bacrot  Adams   2003 045 1/2 Rey A45} [Event "Hrokurinn"][Site "Reyk...

 The data between the {...} are sort keys. You may use ByECO.BAT to sort
 such a file by ECO opening code as shown above. You may use ByYear.bat,
 etc. for other sequences. The command-line "find" utility which comes with
 DOS and Windows may be used against this format very handily since
 selections will be entire games...ready for ONE2PGN to restore to PGN.

   Example: pgn2one 03twic.pgn 03twic.111
            find "Karpov" 03TWIC.111 > 03Karpov.111
            one2pgn 03Karpov.111 03Karpov.pgn

 To combine many PGN files, the copy command will suffice.

   Example: copy  02linar.pgn+03linar.pgn+04linar.pgn    0204lin.pgn
           {NOTE  No spaces in the multiple file names}  {New output}

   Example: copy *.pgn Feb25.pgn


 Additional (and somewhat repetitive) comments about these utilities.

 PGN files from Unix or Linux based computers use a single newline
 character to terminate each line.  Windows PCs require a pair of
 characters for this, and the utility program named crlf.exe will 
 `fix` PGN files from Unix/Linux sources to work properly on Windows
 PCs. PGNTRIM5.exe should then be used to normalize the combined PGN
 file as follows:  pgntrim5 RawPGN.pgn Normal.pgn

 When combining several PGN files into one PGN file 
 (i.e., COPY 2006.PGN+06*.PGN) you may run across some files that
 originated on a Unix/Linux computer and therefore have only a linefeed
 separator for lines (called a newline, or \n). For Windows PCs, you
 need to run the combined file through CRLF.EXE to ensure that every
 line is terminated by a CR/LF character pair as Windows expects.

 Most of these utilities require a new output filename, and DO NOT
 change the original file.
 QSORT and CRLF require only one filename, and it becomes changed...
 so make a backup file first if you are worried.

 Games rejected by PGNTRIM5 will appear in a file named
 BADTRIM5.BAD where they can be reviewed, and then edited or
 discarded. For example, games having zero or one move are rejected,
 as are games with nested curly-brace comments such as
 {23.Qb6 was better {then if 23...Ka8....}}  which are contrary to 
 the PGN standard.  Nested alternate moves within parentheses are 
 proper, and are handled by pgntrim5 unless they are 'unbalanced'; in
 that case they are also sent to the .BAD file. The .BAD file
 continues to grow and grow until you delete it, then a new one will
 be created when needed.

 PGNTRIM5 fixes many common syntax errors and omissions so that the
 output file conforms very closely to the `export format` for PGN. 

 Illegal or impossible moves are detected later when the normalized
 PGN files are imported into a database (as with pgnscid, for example).

 PGN2ECO3 may optionally be run then to assign ECO codes using
 the first 4 moves compared to the plain text table PGN2ECO3.ECO
 which is provided. Be careful modifying this .eco file since
 the sequence AND completeness are important. PGN2ECO3 will use
 the last match it found in PGN2ECO3.ECO as it moves down the
 list. The sequence and content of PGN2ECO3.ECO is critical!
 An [Opening tag for each game will also be inserted into the PGN
 output file if no such tag existed. For help, enter  PGN2ECO3 /?
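
 The "last match wins" rule can be sketched in Python. The real layout
 of the .eco table is not shown in this document, so the table below is
 a made-up list of (moves, ECO code, opening name) entries, and I am
 assuming "first 4 moves" means four full moves, i.e. eight half-moves.
 The function name assign_eco is hypothetical.

```python
def assign_eco(game_moves, table):
    """Return (eco, opening) of the LAST table entry the game begins with."""
    opening_moves = game_moves[:8]   # four full moves, an assumption
    match = None
    for prefix, eco, name in table:
        if opening_moves[:len(prefix)] == prefix:
            match = (eco, name)      # later entries override earlier ones
    return match


# Hypothetical table: general entries first, more specific ones after.
table = [
    (["e4"], "B00", "King's Pawn"),
    (["e4", "d5"], "B01", "Scandinavian"),
    (["d4"], "A40", "Queen's Pawn"),
]
print(assign_eco(["e4", "d5", "exd5", "Qxd5", "Nc3", "Qd8"], table))
```

 This shows why the sequence of the table matters: if the general
 entry came after the specific one, it would win instead.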

 PGN2ONE.exe converts normalized PGN format files to one-record-
 per-game plus a 40-char prefix of useful sort `fields` (or columns).
 QSORT or any such program can then be used to ensure the sequence
 of the .111 format file. (.111 is my convention, any filetype may
 be used). If the file is sorted from position 1 through 40, then
 duplicates may be dropped if PGNUNDUP.exe is then executed.
 An example command to drop duplicates from a sorted .111 file is:
 pgnundup amber.111 amber.pgn

 The one-record-per-game format has many useful functions:
     1. It is simpler to select or drop large sets of games as for
        example dropping all draws of less than nn moves, selecting
        only games from one or several specific years.  The `find`
        command, or a custom-written program is useful for all this.
     2. The games can easily be sorted by White, Black, ECO, number
        of moves in the game, result (1-0,0-1, etc.), or year.
     3. When sorted on positions 1 to 40, and used as input to
        PGNUNDUP.exe, duplicate games are dropped as the new output
        file is written.
     4. Games having desired characteristics such as neither player
        castling can be easily selected.  Likewise for castling long,
        checkmate, or certain variations within an ECO opening code.

 If duplicates need not be dropped, you may use ONE2PGN.exe to convert
 the .111 format back to PGN with one2pgn Leko.111 Leko.pgn

 Finally, if desired, a PGN file can be reduced to contain only games
 where both players are rated 2450 or above by using PGNBEST6.exe i.e.,
 pgnbest6 04all.pgn 04best.pgn
 pgnbest6 04all.pgn 04best.pgn /ELO        which causes [WhiteElo and
 [BlackElo tags to be updated or created.

 PGNBEST6.exe uses the plain text ratings file PGNBEST6.rat to select
 the strongest players. This file need not be in alphabetical, or any
 other sequence, but may be easier to maintain in alphabetical sequence.
 One method is to capture new FIDE ratings lists into the .rat format,
 and keep `classic` masters such as Alekhine, Fischer, etc. at the end
 where they can simply be copied as one block into a new .rat file.
 Since the `Classic` players are not gaining new ratings, an estimated
 rating is used, and would only apply to games in which they played.
 Up-and-coming new masters (such as Magnus Carlsen) should be added to 
 cause their games to be selected.  PGNBEST6.rat uses the first seven
 characters of player names for selections and matches since this has
 been found by lengthy testing to be optimum.


 S U M M A R Y

 I run pgntrim5 immediately after downloading PGN files from the internet.

 If I am going to combine two or more PGN files, I do that next using
 some variant of the "copy" command to achieve the desired result.
 For example:   copy twic48*.PGN+twic49*.pgn+twic50*.pgn twic2004.pgn

 If I am going to select only the games where both players are very strong,
 then I run pgnbest6 using the plain-text user-modifiable table file
 named pgnbest6.rat


 The following files are contained in the archive file PGNUTILS.ZIP

111TOELO.EXE   Reads .111 records, adds GM Elo if none present.
111TOELO.RAT   Used by above program. Includes classic Grandmasters.

111YEAR.EXE    Reads .111 in any sequence, appends .111 to specific year
               files such as 1854.111, 2001.111, etc.

2600PLUS.RAT   Recent ratings of Grandmasters with FIDE rating of 2600+.

BYECO.BAT      These batch files sort .111 files in different ways....

BYPLAYER.EXE   Reads .111 in any sequence, writes Leko.111, Anand.111, etc.

CHOICE.DOC     Free utility from Microsoft for making choices in batch files.

CRLF.EXE       Scans any text input (including PGN), and ensures all lines
CRLF.TXT       ...end in CR/LF rather than just \n (Linefeed).

ECOBYDES.TXT   Gives ECO code from opening description.
ECOBYECO.TXT   Gives opening description from ECO code.

FILE_ID.DIZ    Brief summary of PGNUTILS.ZIP for website file listings.

FIXCRLF.EXE    Similar to CRLF

ICCF.RAT       Correspondence chess ratings list

ONE2PGN.EXE    Convert .111 format in any sequence back into PGN.

PGN2ECO3.ECO   OPTIONAL: reads and writes PGN assigning ECO only if missing.
PGN2ECO3.EXE   --------  Caution...these are APPROXIMATE ECO codes, only.

PGN2ONE.EXE    Converts PGN file to .111 file of one line per game.

PGNBEST6.EXE   Reads PGN, writes PGN of GMs having FIDE Elo of 2450+

PGNSCID.EXE    Loader for SCID database will catch illegal or impossible
               ...moves which PGNTRIM5 misses.

PGNTRIM5.DOC   This is the primary normalization program for PGN files.
PGNTRIM5.EXE   Read the .DOC file for further details.

PGNUNDUP.EXE   Reads SORTED .111, writes PGN while dropping duplicate games.

PGNUTILS.TXT   This file

QSORT.BRF      Brief documentation...all you need is in here!
QSORT.DOC      Full documentation
QSORT.EXE      Command-line sort utility handles ANY size file.

TEST.PGN       Important test data exercising PGNTRIM5 & showing functions.

UN_EOF.EXE     Removes excessive end-of-file characters leaving only one!