After several years, much assistance from Behold's great GEDCOM validation, advice from Louis and support from Darrin and the User Group of The Next Generation of Genealogy Sitebuilding (TNG), I now have a GEDCOM exported file which meets 5.5.1.
What a pleasure it was to see
'No problems to report.'
in the File Information section of The Everything Report and
'Summary of all GEDCOM Messages
No problems to report.'
in the log file.
Thanks all for the assistance.
Don't jump for joy just yet. Behold won't do complete GEDCOM checking until Version 1.5 and right now it lets a few things go.
You should also try a couple of other validators: VGedX at http://ancestorsnow.com/tools/vgedx.php and GED-inline at http://ged-inline.elasticbeanstalk.com/validate
I wouldn't mind hearing what they catch that Behold didn't.
OK, so I did jump too early.
I have validated through the above and also Genealogica Grafica and GenMatcher.
I will report all errors in an email.
In the meantime, the one error of most interest is:
Invalid content for DATE tag: '28 APR  1997' is not a valid <DATE_VALUE>
This does not error in Behold. Note the 2 spaces between APR and 1997.
Good catch. Behold was not reporting extra spaces between date parts. Next version will.
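For anyone who wants to look for this kind of problem before running a validator, here is a minimal Python sketch. It is not how Behold, GED-inline or VGedX do it; the function name find_date_spacing_problems and the sample lines are made up for illustration, and it only looks at spacing, not at whether the date itself is valid.

```python
import re

# Hypothetical helper, not taken from any validator mentioned in this thread.
# Matches a GEDCOM line of the form "<level> DATE <value>".
DATE_LINE = re.compile(r"^\s*\d+ DATE (.*)$")


def find_date_spacing_problems(lines):
    """Return (line_number, value) pairs for DATE values with extra spaces."""
    problems = []
    for number, line in enumerate(lines, start=1):
        match = DATE_LINE.match(line.rstrip("\r\n"))
        if not match:
            continue
        value = match.group(1)
        # A clean value has no leading/trailing space and no doubled spaces.
        if value != value.strip() or "  " in value:
            problems.append((number, value))
    return problems


# Made-up sample data; the third line has two spaces between APR and 1997.
sample = [
    "0 @I1@ INDI",
    "1 BIRT",
    "2 DATE 28 APR  1997",
    "1 DEAT",
    "2 DATE 3 MAY 2001",
]
print(find_date_spacing_problems(sample))
```

Only the BIRT date in the sample is flagged; the DEAT date uses single spaces and passes.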
Comparing the GED-inline and VGedX reports you sent me, I see that GED-inline reports all 7 problems that VGedX does but with better messages, and then finds 17 more problems, most of which are CONT tags where they are not allowed.
Darrin won't like the first error though. "Invalid content for SOUR tag: 'The Next Generation of Genealogy Sitebuilding' is not a valid <APPROVED_SYSTEM_ID>". It is not checking against a list, but is simply reporting that the maximum length allowed for <APPROVED_SYSTEM_ID> is 20 characters.
I have a similar issue with the Version_Number, where I'd like to put both the Behold version number and the date in that field, but only 15 characters are allowed.
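A length check like the one GED-inline seems to be applying could be sketched roughly as follows in Python. The two limits (20 and 15) are the ones quoted above; the "HEAD.SOUR" style keys and the name check_value_length are my own invention, and the sketch assumes the limit applies to the value only, not to the level number or tag.

```python
# Size limits quoted in this discussion. The dotted keys are a hypothetical
# way of naming where the value sits in the header.
MAX_LENGTH = {
    "HEAD.SOUR": 20,       # <APPROVED_SYSTEM_ID>
    "HEAD.SOUR.VERS": 15,  # <VERSION_NUMBER>
}


def check_value_length(path, value):
    """Return a message if value exceeds the limit for path, else None.

    Assumption: the limit counts only the value, not the level or tag.
    """
    limit = MAX_LENGTH.get(path)
    if limit is not None and len(value) > limit:
        return f"{path}: '{value}' is {len(value)} characters, limit is {limit}"
    return None


print(check_value_length("HEAD.SOUR", "The Next Generation of Genealogy Sitebuilding"))
print(check_value_length("HEAD.SOUR", "Behold"))
print(check_value_length("HEAD.SOUR.VERS", "1.0.4, 23 Jan 2012"))
```

The long TNG name and the version-plus-date string both produce a message, while a short value such as Behold returns None.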
Behold in Version 1.5 should catch all the errors that GED-inline and VGedX report and maybe more. And then Behold will convert the invalid GEDCOM input into valid GEDCOM output.
The warning that VGedX gives that record 0 @I1884@ INDI is not referenced is already handled by Behold, not as a message, but by placing the individual in the Unconnected Individuals section of the Everything Report.
Does the GEDCOM character limit include the TAG?
If not, would this be suitable for version and date?
VERS xx.x.x mm-dd-yy
What I'll probably do is:
1 SOUR Behold
2 VERS 1.0.4
2 NAME Behold version 1.0.4, 23 Jan 2012
Following your excellent advice here is the report from GedInline:
*** Line 11: Invalid content for CHAR tag: 'ANSI' is not a valid <CHARACTER_SET>
*** Line 36: Tag NOTE is not allowed under SUBM
*** Line 51: Invalid content for QUAY tag: '4' is not a valid <CERTAINTY_ASSESSMENT>
*** Line 73: Invalid content for QUAY tag: '4' is not a valid <CERTAINTY_ASSESSMENT>
*** Line 9946: Line contains tab character, not allowed under GEDCOM rules
*** Line 13032: Line contains tab character, not allowed under GEDCOM rules
*** Line 13322: Line contains tab character, not allowed under GEDCOM rules
*** Line 14296: Line contains tab character, not allowed under GEDCOM rules
*** Line 43757: Line contains tab character, not allowed under GEDCOM rules
I've no idea of how the tabs wound up in the file, but they will be gone shortly :) Likewise a fix for the rest, easy enough to do once you know what is what!
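For tracking these down in a large file, a rough Python sketch along these lines may help. It is only an illustration, not GED-inline's method: scan_lines is a made-up name, the sample lines are invented, and the allowed QUAY and CHAR values are my reading of the 5.5.1 standard, so treat them as assumptions to check against the spec.

```python
# Assumed from the GEDCOM 5.5.1 standard; verify against your copy of the spec.
VALID_QUAY = {"0", "1", "2", "3"}
VALID_CHAR = {"ANSEL", "UTF-8", "UNICODE", "ASCII"}


def scan_lines(lines):
    """Return (line_number, message) pairs for a few simple problems."""
    messages = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if "\t" in line:
            messages.append((number, "line contains tab character"))
        # Split into level, tag and value; lines without a value are skipped.
        parts = line.split(" ", 2)
        if len(parts) < 3:
            continue
        _level, tag, value = parts
        if tag == "QUAY" and value not in VALID_QUAY:
            messages.append((number, f"'{value}' is not a valid QUAY value"))
        if tag == "CHAR" and value not in VALID_CHAR:
            messages.append((number, f"'{value}' is not a valid CHAR value"))
    return messages


# Invented sample lines covering the three kinds of message in the report.
sample = [
    "1 CHAR ANSI",
    "3 QUAY 4",
    "1 NOTE born\tin Ohio",
    "3 QUAY 2",
]
for number, message in scan_lines(sample):
    print(f"*** Line {number}: {message}")
```

To run it over a real file, pass the open file object instead of the sample list, since iterating a file yields its lines.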
I notice that this is an addition to an older post, but I thought you would like the information to add to the list of clean files versus validators. VGedX only reports unreferenced individuals, as Behold does, as you mentioned. You might at some point add a 'take me to the problem' feature, just a thought :)
Thanks for that info, hsm. After I add full GEDCOM checking to Behold, it should catch these errors and more.
I got to looking at the list while repairing it and noticed that the line that says:
*** Line 36: Tag NOTE is not allowed under SUBM
is quite clearly incorrect! NOTE under SUBM is certainly allowed under 5.6 and 5.5.1 (which I use) but not allowed under 5.5. Makes me wonder how they parse the GEDC.VERS <VERSION_NUMBER>? I always (for no particular reason that I remember) simply use a dotted numeric string: 5.5.1. I believe that I got the idea from PAF, and likewise from the examples shown on pages 60, 74 and 78 of the 5.5 standard, the 5.5.1 standard and the 'draft' 5.6 standard. I might ask them, come to think of it!
Yes, the difficulty in checking GEDCOM is that what is valid in one version is not necessarily valid in another. And the specs are ambiguous in places and leave things out.
The bigger problem occurs when a program writes GEDCOM 5.5 and says it is 5.5.1 or vice versa.
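To make the version point concrete, here is a small Python sketch of a checker that reads the declared version from the header and picks its rule from that. The names declared_version and note_allowed_under_subm are hypothetical, the rule table holds only the NOTE-under-SUBM case described above, and a real validator would need many more rules. It also shows why a mislabelled file is a problem: the checker can only trust what GEDC.VERS says.

```python
# Rule table built from the discussion above: NOTE under SUBM is allowed in
# 5.5.1 but not in 5.5. Other versions are left out rather than guessed.
NOTE_UNDER_SUBM = {"5.5": False, "5.5.1": True}


def declared_version(lines):
    """Return the value of HEAD.GEDC.VERS, or None if it is not found."""
    path = []
    for raw in lines:
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue
        level, tag = int(parts[0]), parts[1]
        # Keep only the ancestors of this line, then add the line's own tag.
        del path[level:]
        path.append(tag)
        if path == ["HEAD", "GEDC", "VERS"] and len(parts) == 3:
            return parts[2].strip()
        if level == 0 and tag != "HEAD":
            break  # past the header, stop looking
    return None


def note_allowed_under_subm(version):
    """Return True or False per the table, or None for an unknown version."""
    return NOTE_UNDER_SUBM.get(version)


# Invented header; note that the SOUR.VERS line is not mistaken for GEDC.VERS.
header = [
    "0 HEAD",
    "1 SOUR Behold",
    "2 VERS 1.0.4",
    "1 GEDC",
    "2 VERS 5.5.1",
    "2 FORM LINEAGE-LINKED",
    "0 @U1@ SUBM",
]
version = declared_version(header)
print(version, note_allowed_under_subm(version))
```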
Copyright © Louis Kessler
All Rights Reserved