The World’s Oldest GEDCOM File? - Sun, 17 Aug 2014
While preparing my presentation of Reading Wrong GEDCOM Right for the Gaenovium Conference, I wanted to see if I had in my collection of over 600 test GEDCOM files some early GEDCOMs from the pre-GEDCOM 5.0 era.
I searched my files for some of the pre-GEDCOM 5.0 tags outlined by Tamura Jones in his GEDCOM Tags article. I didn’t have any such files. So I searched the web. I was surprised to find just one single file. It was among the collection of GEDCOMs at the now abandoned Genealogy Forum site.
The file I found was gedr6127.ged and the start of it looks like this:
The file information for this file at Genealogy Forum states that it was uploaded by Jean Hudson Masco on April 21, 1997. This is well past the introduction of the GEDCOM 5.0 draft in December 1991. The file header states that it was created with PAF, so the file must have been created by what in 1997 was a very old version of PAF. The VERS tag is not given in the HEAD section of this GEDCOM, so the version of PAF cannot be identified, nor can the version of GEDCOM that this file represents.
This was a very exciting find for me, sort of like an archaeological dig unearthing an ancient unknown language. I don’t know anyone who has a specification of GEDCOM prior to version 5.3 (if anyone does, please let me know), so it now became a matter of interpreting the text and seeing if I could translate it.
As it is, the current version of Behold cannot display the people in this file correctly. The first problem is that on the 0 INDI record lines, there is no space between the end of the identifier, e.g. @242@, and the tag, e.g. INDI.
This also is a problem on the 0 FAM tags, except in this file they are not FAM tags but FAMI tags with an “I” on the end.
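The two record-line quirks described above can be repaired with a small normalization step. Here is a minimal sketch in Python (not Behold's actual code). The function name normalize_record_line is hypothetical, and the exact layout of the glued line, such as 0 @242@INDI, is assumed from the description above.

```python
import re

# A level-0 line whose identifier runs straight into the tag, e.g. "0 @242@INDI".
# The optional whitespace also lets already well-formed lines pass through.
_RECORD = re.compile(r"^0\s+(@[^@]+@)\s*([A-Za-z_]+)\s*$")


def normalize_record_line(line: str) -> str:
    """Return the line with a space after the identifier and FAMI mapped to FAM.

    Expects one line without its line terminator. Lines that are not
    level-0 records with an identifier are returned unchanged.
    """
    match = _RECORD.match(line)
    if match is None:
        return line
    xref, tag = match.groups()
    if tag == "FAMI":  # old spelling of the family record tag
        tag = "FAM"
    return f"0 {xref} {tag}"


print(normalize_record_line("0 @242@INDI"))
print(normalize_record_line("0 @89@FAMI"))
```

Lines such as 0 HEAD, or any line at level 1 or deeper, do not match the pattern and pass through untouched.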
The other interesting difference is the linkages. Look in the above example and you’ll see a pair of lines: 1 PARE followed by 2 RFN @89@. This is a link to the parents of the person, and in versions since GEDCOM 5.3, this has become a single line: 1 FAMC @89@.
All the other linkages were different as well. The list is:
- FAMC was PARE + RFN
- FAMS was FAMF + RFN
- CHIL was CHIL + YOUN
- HUSB was HUSB + RFN
- WIFE was WIFE + RFN
and I’m still working on the extra one they had which now has no equivalent:
- SIBL + OLD
which seems to be a link to a sibling. That should be redundant information, but I’ll check that.
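The linkage list above amounts to a lookup table from an old two-line pair to a modern one-line tag. The following Python sketch shows one way to apply it. It assumes every pair follows the same shape as the 1 PARE / 2 RFN @89@ example (a bare level-1 tag followed by a level-2 tag carrying the identifier), which I have only seen confirmed for PARE. The names OLD_LINKS and merge_links are hypothetical, and SIBL + OLD is deliberately left alone since its meaning is still unclear.

```python
# (level-1 tag, level-2 tag) in the old format -> single tag in newer GEDCOM
OLD_LINKS = {
    ("PARE", "RFN"): "FAMC",
    ("FAMF", "RFN"): "FAMS",
    ("CHIL", "YOUN"): "CHIL",
    ("HUSB", "RFN"): "HUSB",
    ("WIFE", "RFN"): "WIFE",
}


def merge_links(lines):
    """Collapse old two-line links into one-line links; keep other lines as-is."""
    out = []
    i = 0
    while i < len(lines):
        first = lines[i].split()
        second = lines[i + 1].split() if i + 1 < len(lines) else []
        is_pair = (
            len(first) == 2 and len(second) == 3
            and first[0] == "1" and second[0] == "2"
            and (first[1], second[1]) in OLD_LINKS
        )
        if is_pair:
            new_tag = OLD_LINKS[(first[1], second[1])]
            out.append(f"1 {new_tag} {second[2]}")
            i += 2  # both old lines are consumed
        else:
            out.append(lines[i])
            i += 1
    return out


sample = ["1 PARE", "2 RFN @89@", "1 SIBL", "2 OLD @5@"]
print(merge_links(sample))
```

Keeping the mapping in a dictionary means a translation for SIBL + OLD, if one turns out to be needed, is a one-line addition.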
The dates are also in yyyymmdd format, which has been changed in newer GEDCOMs to dd MMM yyyy. In a way, the old version was better, because yyyymmdd is the basic format of the ISO 8601 standard for date representation. Within a GEDCOM file, it doesn’t matter how a date is stored. The GEDCOM file is not meant to be viewed by the genealogist. It is your genealogy software that simply must load the information and display it understandably for you. And using English month names for the 3-letter abbreviation does more harm than good. So I’m not sure why later versions made this change.
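Converting between the two date forms is mechanical. As an illustration, here is a short Python sketch. The function name to_newer_date is hypothetical, and it only handles complete eight-digit dates; how the old format wrote partial or approximate dates is something I don't know, so anything else is returned unchanged. The month table is spelled out rather than taken from the system locale, since the abbreviations in the file are always the English ones.

```python
MONTHS = ("JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC")


def to_newer_date(value: str) -> str:
    """Turn 'yyyymmdd' into 'd MMM yyyy'; return anything else unchanged."""
    if len(value) != 8 or not (value.isascii() and value.isdigit()):
        return value
    year = value[:4]
    month = int(value[4:6])
    day = int(value[6:8])
    # Only a range check; this does not verify that the day exists in the month.
    if not 1 <= month <= 12 or not 1 <= day <= 31:
        return value
    return f"{day} {MONTHS[month - 1]} {year}"


print(to_newer_date("19970421"))  # 21 APR 1997
```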
So I have now changed my development version of Behold so that these situations will be handled (and this will be included in the next release of Behold, in case anyone else happens to have some ancient GEDCOMs lying around). Once I did that, Behold was able to properly present the information in the file.
Do you have any of these ancient GEDCOMs lying around in this format? The sure way to tell is if the file ends with a line containing: 0 EOF. Newer GEDCOM versions end with 0 TRLR. I wouldn’t mind having a few more for testing, so if you have an oldie, please contact me.
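If you want to check a batch of files, the trailer test is easy to script. This is a minimal Python sketch with hypothetical function names (trailer_tag, looks_ancient). It looks only at the last non-blank line, as described above, and makes no other claim about the file.

```python
def trailer_tag(lines):
    """Return the tag on the last non-blank line if it is a level-0 line, else None."""
    for line in reversed(lines):
        parts = line.split()
        if parts:
            if len(parts) >= 2 and parts[0] == "0":
                return parts[1]
            return None
    return None


def looks_ancient(lines):
    """True when the file ends with 0 EOF rather than the newer 0 TRLR."""
    return trailer_tag(lines) == "EOF"


print(looks_ancient(["0 HEAD", "0 @1@INDI", "0 EOF", ""]))
print(looks_ancient(["0 HEAD", "0 @1@ INDI", "0 TRLR"]))
```

To run it on a real file, pass in the result of reading the file and calling splitlines() on its text.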
Followup: Aug 21, 2014
This file has now been confirmed to be a GEDCOM 2.0 file.
Discussion the next day with Tamura Jones led to the conclusion that there is an older file available, GEDCOM 1.0, and the program Family History System by Phillip Brown seems to be the only program built to export to (and import from) that earliest format. Even PAF only started with GEDCOM 2.0.
See the GEDCOM 1.0 article by Tamura Jones for interesting information on this.