Louis Kessler’s Behold Blog » Blog Entry

GEDCOM Assessment - Sat, 8 Feb 2020

I’ve been working hard to get Behold 1.3 completed. It will primarily be a newer iteration of Behold’s Everything Report. Once that is released, I’ll start my effort to add GEDCOM export, followed by editing.

I’ve designed Behold to be a comprehensive and flexible GEDCOM reader that understands and presents to you all the data contained in GEDCOM of any flavour, from 1.0 to 5.5.1, with developer extensions and user-defined tags. So when John Cardinal came up with his GEDCOM Assessment site, that was an opportunity I couldn’t resist.

“assess.ged is a special GEDCOM file which you may use to perform a review of the GEDCOM 5.5.1 import capability of any program that reads a GEDCOM file and imports the contents”

John is a long-time user of The Master Genealogist program written by Bob Velke. John is also a programmer and wrote programs to work with TMG including Second Site for TMG, TMG Utility and On This Day.

After TMG was retired in 2014, John wanted to help people get all their data out of TMG so they could transfer to other programs, so he wrote the TMG to GEDCOM program. He also wrote Gedcom Publisher, a program that creates an e-book from a GEDCOM file. John then wrote a program to create a website from any generic GEDCOM file and called that program GedSite.

In the process of all this, John gained expertise in working with GEDCOM and has made tests for GEDCOM compatibility that he invites all genealogy software authors to try.

So try it I shall. 

I followed John’s “process” and downloaded version 1.03 of the assess.ged file as well as the image files it references, and placed the latter in a C:\GedcomAssessment folder. Then I loaded assess.ged into Behold 1.2.4 and used his website’s Data Entry page to capture the results. This really is a beautifully set up assessment system. My compliments to John Cardinal.


A Few Things To Fix for Version 1.3

There were a number of tests that illustrated some aspects of GEDCOM that Behold does not fully support. I’ve made a list of them here:

  1. Behold by default uses the first person in the file and treats that person (and their spouse(s)) as the family the report is about. (You can of course pick anyone you want instead of or in addition to the first person). The assess.ged file does not link the 185 people in the file to each other, except for two who are connected as spouses. Behold was not using the first person in the file as a singular family but instead had the first section blank and listed all the people, including that first person, in its “Everyone Else” section. This should be a simple fix.
  2. I was surprised to see Behold display:  1 FACT Woodworking 2 TYPE Skills as Woodworking Skills rather than Skills: Woodworking. That’s a bug because I intended it the latter way. Same for 1 IDNO 43-456-1899 2 TYPE Canadian Health Registration which was being displayed as 43-456-1899 Canadian Health Registration rather than Canadian Health Registration: 43-456-1899.
  3. Behold somehow was ignoring and not displaying the TIME value on the change date of a record.
  4. GEDCOM specifies that when the CONC tag is used to concatenate two lines, the split must fall inside a word, so that the second half of the word begins the second line. Behold does this, but in doing so, Behold trims the lines before concatenating. As a result, if a GEDCOM uses the non-standard method of including a leading space on the second line or a trailing space on the first line, that space is ignored, and the word at the end of the first line and the word at the beginning of the second line are joined with no intervening space. I haven’t noticed programs using this non-standard format, but even so, I’ll think about it and maybe I’ll remove Behold’s trimming of concatenated lines in version 1.3.
  5. Behold displays: “Birth, date”.  But it should display “Birth: date”. Same for other events such as “Adoption, date” or “Baptism, date”. How did that ever happen?
  6. Behold currently displays the user-defined tag _PRIM as “Primary: Y” after a date, but retains the first-entered date as primary and does not use this tag to make that date primary. I’ll think about honoring the _PRIM tag in version 1.3.
  7. The non-standard shared event tag, e.g. 1 CENS 2 _SHAR @I124@ is not being displayed correctly by Behold. This will be fixed.
  8. Behold does not convert @@ in notes or text values to @, as it should. Technically, all line values should be checked for @@ and converted, and that includes names.
  9. Hyperlinks to objects unfortunately do not open the file because Behold added a period to the end of it. This is a bug that I noticed a few weeks ago and has already been fixed for the upcoming version of Behold under development.
  10. An alias tag (ALIA) whose value is the name of the person rather than a link is not valid according to the GEDCOM standard, but it may be something I want to support if I see that some programs export it into their GEDCOMs.
  11. I’m not displaying the tags under a non-standard top level 0 _PLAC structure correctly. This includes 1 MAP, 2 LATI, 2 LONG and 1 NOTE tags under the 0 _PLAC record.
  12. Non-standard place links such as: _PLAC @P142@ that link to the 0 _PLAC records should have been working in Behold, but the display of these links needs to be improved.
  13. If a person’s primary name has a name type, then it should be repeated with the type on the next line, e.g.
       Birth name:  Forename(s) Surname
    Also additional names should be called “Additional name” rather than just “Name”.
  14. Names with a comma suffix should not be displayed with a space between the surname and the comma. I’ve actually never seen this in the wild.
    e.g. /Smith/, Jr should be displayed as Smith, Jr and not Smith , Jr
  15. Notes on places are repeated and shouldn’t be.  Dates should be shown following any notes or other subordinate info for the place.
  16. Addresses could be formatted better.
  17. EVEN and ROLE tags on a source citation should have their tag text looked up and displayed instead of just displaying the tag name.
  18. The OBJE link was not included in source citation when it should have been.
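
Items 4 and 8 can be illustrated with a minimal Python sketch. This is not Behold’s code. The function name join_value and its trim switch are hypothetical, and the sketch assumes the continuation lines have already been parsed into (tag, text) pairs.

```python
def join_value(first, continuations, trim=False):
    """Assemble a GEDCOM line value from its first line plus CONC/CONT lines.

    continuations is a list of (tag, text) pairs, where tag is "CONC" or
    "CONT". trim=True mimics the trimming behaviour described in item 4.
    """
    parts = [first.strip() if trim else first]
    for tag, text in continuations:
        if trim:
            text = text.strip()
        if tag == "CONT":
            parts.append("\n" + text)   # CONT starts a new line
        elif tag == "CONC":
            parts.append(text)          # CONC joins with no separator
        else:
            raise ValueError(f"unexpected tag: {tag}")
    # Item 8: a doubled @@ in a line value stands for a single @.
    return "".join(parts).replace("@@", "@")


# Standard usage: the word "long" is split across the two lines.
print(join_value("This is a lo", [("CONC", "ng note")]))
# Non-standard usage: the trailing space survives only without trimming.
print(repr(join_value("first line ", [("CONC", "second")], trim=True)))
print(repr(join_value("first line ", [("CONC", "second")], trim=False)))
```

With trim=True the non-standard case loses its space and the two words run together, which is the behaviour item 4 describes. Unescaping @@ after joining also handles the case where the split happens to fall between the two @ characters.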

So that was a really good exercise. Most of these are minor, but a lot more issues came up than I expected. Over the next few days, I’ll resolve each of these in the development version of Behold which soon is to become version 1.3.
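
Two of the display fixes in the list (items 2 and 14) are simple enough to sketch in a few lines of Python. The function names label_with_type and display_name are hypothetical and only illustrate the intended output; they are not taken from Behold.

```python
import re


def label_with_type(value, type_text):
    """Show a subordinate TYPE as a label, e.g. 'Skills: Woodworking'."""
    return f"{type_text}: {value}" if type_text else value


def display_name(gedcom_name):
    """Turn a GEDCOM NAME value such as '/Smith/, Jr' into display text."""
    # Remove the slashes that delimit the surname.
    text = gedcom_name.replace("/", "")
    # Collapse runs of whitespace, then drop any space left before a comma.
    text = re.sub(r"\s+", " ", text).strip()
    return re.sub(r"\s+,", ",", text)


print(label_with_type("Woodworking", "Skills"))
print(label_with_type("43-456-1899", "Canadian Health Registration"))
print(display_name("/Smith/, Jr"))
```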

  
Results and Comparison

John presents a Comparison Chart that currently compares the results for 15 programs. There are 192 tests. Here’s my summary of John’s Comparison.

[Image: summary table of John’s Comparison Chart]

I’ve added Behold’s result in my chart. I’ve also excluded John’s program GedSite in summarizing the other programs, because his results are for a program that has already been tuned to handle these tests. So GedSite’s numbers are a good example of the results that I and other developers should try to attain with our programs.

Behold didn’t do too badly, with 161 supported constructs out of 192. Best was GedSite’s 185, followed by Genealogie Online’s 179, then My Family Tree’s 169, and then Behold’s 161. Genealogie Online is the baby of Bob Coret, who is another GEDCOM expert, and My Family Tree is by Andrew Hoyle of Chronoplex Software, who also makes GEDCOM Validator, so you would expect both of them to do well with regard to GEDCOM compliance.
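
To put those counts in proportion, here is a small Python sketch that converts them to percentages of the 192 tests. It uses only the four “supported” counts quoted in the paragraph above; the names TOTAL_TESTS and support_rates are just for illustration.

```python
TOTAL_TESTS = 192

# "Supported" counts as quoted in the paragraph above.
supported = {
    "GedSite": 185,
    "Genealogie Online": 179,
    "My Family Tree": 169,
    "Behold": 161,
}


def support_rates(counts, total):
    """Return each program's supported share as a percentage, to one decimal."""
    return {name: round(100 * count / total, 1) for name, count in counts.items()}


for name, rate in support_rates(supported, TOTAL_TESTS).items():
    print(f"{name}: {rate}%")
```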

I’ve emailed the JSON text file of Behold’s results to John. Hopefully he’ll add Behold to his comparison chart.


Comments About John’s Test File and Data Entry Page

  1. The assess.ged file version 1.03 includes a 1 SEX M line in each of the test cases. I’m not sure why. SEX is not a required tag in an INDI record. For a test file, it would be simpler to just leave the SEX lines out.
  2. I disagree with the constructs of two of the Master Place Link by XREF tests. They include, within one event, both a standard PLAC place value and the non-standard _PLAC xref link, i.e.: 
       1 CENS
       2 PLAC New York
       2 _PLAC @P158@
    The trouble I have with this is that GEDCOM only allows one place reference per event. By using this alternative tag, you’ve effectively got two, which would be illegal if they were both PLAC tags. And what if they are not the same? John should take out the 2 PLAC New York line from his NAME 02-Link by XREF tests where he has the 2 _PLAC tag so that there is only one place reference. Any programs allowing both PLAC and _PLAC tags on the same event should cease and desist from doing this. The second test, where the 3 _PLAC tag is under the 2 PLAC tag, is an even more horrible construct that no one should support.
  3. The GEDCOM Assessment Data Entry Page does not completely function in all browsers. When using my preferred browser Microsoft Edge, entering “Supported (w/comment)” did not bring up the box to enter the comment. I tried Internet Explorer and the page did not function at all. I had to switch to Google Chrome (or Firefox) to complete the data entry.
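
The double place reference from comment 2 is easy to detect mechanically. The following is a rough Python sketch, assuming the GEDCOM has been read into a list of text lines; the function name events_with_two_places is hypothetical. It only looks for the first construct (2 PLAC and 2 _PLAC under the same level-1 event), not the 3 _PLAC nested under 2 PLAC.

```python
def events_with_two_places(lines):
    """Return (line_number, event_tag) for each level-1 event that has both
    a level-2 PLAC and a level-2 _PLAC subordinate line."""
    found = []
    event = None   # (line_number, tag) of the current level-1 line, if any
    seen = set()   # place tags seen so far under that line

    def flush():
        # Record the current event if it carried both kinds of place tag.
        if event and {"PLAC", "_PLAC"} <= seen:
            found.append(event)

    for number, line in enumerate(lines, start=1):
        fields = line.split(None, 2)
        if len(fields) < 2 or not fields[0].isdigit():
            continue   # skip blank or malformed lines
        level, tag = int(fields[0]), fields[1]
        if level <= 1:
            flush()
            event = (number, tag) if level == 1 else None
            seen = set()
        elif level == 2 and tag in ("PLAC", "_PLAC"):
            seen.add(tag)
    flush()
    return found


sample = [
    "0 @I1@ INDI",
    "1 NAME Test /Person/",
    "1 CENS",
    "2 PLAC New York",
    "2 _PLAC @P158@",
    "1 BIRT",
    "2 PLAC Boston",
    "0 TRLR",
]
print(events_with_two_places(sample))
```

On the sample, only the CENS event on line 3 is reported, because the BIRT event has a single place reference.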


Conclusion

What this little exercise does show is how hard it is to get all the little nuances of GEDCOM programmed correctly and as intended. This assessment took the better part of a day to do, but I think it was well worth the time and effort.

And what’s really nice about having a file with test cases is that they provide simple examples that illustrate issues that can be fixed or improved.

I hope all other genealogy software authors follow my lead and test their programs with GEDCOM Assessment’s assess.ged file. Then it’s a matter of using this analysis to help make their programs more compatible with the standard and thus do their part to help improve genealogical data transfer for everyone.




Update  Feb 10:  John reviewed the assessment with me. A few results changed status and I’ve updated the table above. John mentioned that his creation of GedSite wasn’t a conversion of Second Site for TMG, but was a completely new program.

Behold version 1.2.4’s final assessment is now available here:
https://www.gedcomassessment.com/en/assessment-behold.htm

Once I complete version 1.3, I’ll likely submit it again for a new assessment.

1 Comment

1. coret (coret)
Joined: Thu, 15 Dec 2011
5 blog comments, 0 forum posts
Posted: Sun, 9 Feb 2020  Permalink

A small remark about the summary comparison table: readers should note that sometimes the “max” value is the best (like “Supported”) and sometimes the “min” value is the best (like “Imported incorrectly”).

 

The Following 1 Site Has Linked Here

  1. Best of the Genea-Blogs - 2 to 8 February 2020 - Geneamusings - Randy Seaver : Mon, 10 Feb 2020
    * GEDCOM Assessment by Louis Kessler on Behold Genealogy Blog.
