Louis Kessler’s Behold Blog

A Recipe for GEDCOM - Sat, 3 Aug 2013

You can hear rumblings from the GEDCOM volcano of genealogy bloggers.

On August 1, James Tanner wrote “Sharing Data Files or What Happened to GEDCOM?” He says most genealogists are not aware there is a file compatibility issue until they are personally faced with the problem.

Dear Myrtle wrote “Genealogy data sharing REVISITED”, where she laments the fact that FamilySearch is “dropping the GEDCOM X ball”.

Randy Seaver wrote “Standards, GEDCOM, FHISO, and my Genea-Fantasy”, with the idea that the FamilySearch API might be the vehicle.

James Tanner followed up his first post the next day with “Is Moving Towards a Solution for Establishing Data Communications Standards Possible?”, where he argues that developers have no incentive to share their program’s unique features with the standard.

I’ve been working with GEDCOM for over 15 years. In that time it hasn’t changed. The standard has been GEDCOM 5.5 and the de facto standard has been GEDCOM 5.5.1. My work with Behold has required me to delve into and try to understand every nook and cranny of the existing standard, as well as find a way to handle the non-standard extensions that vendors are including.

I’ve followed and contributed to many technical GEDCOM discussions on the GEDCOM-L mailing list, BetterGEDCOM and GEDCOM X, and have supported and contributed a paper to FHISO. I’ve watched with interest as various programs implemented different ways, outside of GEDCOM, to transfer data. And I’ve looked at dozens of GEDCOM alternatives that have been proposed.

So what can we do to bake a new GEDCOM cake?

All the needed ingredients are now available:

1. The GEDCOM standard, used at least partially by almost all programs today.

2. Dozens of alternative standards that would change the world for everyone.

3. Billions of ideas of all the teensy details that “need” to be in the new standard.

We have the cooks:

1. A FHISO team to do the baking.

2. Promised support by dozens of top vendors and genealogical organizations.

Here’s the recipe:

1. Start with the goodness. A lot of thinking went into GEDCOM. The fact that some parts of it were not used by some vendors is not entirely GEDCOM’s fault. Decide what is good about GEDCOM and keep it.

2. Find what’s broken and doesn’t work. Fix those things. Don’t fix what works.

3. Keep it simple. Every complexity makes it more difficult for developers to integrate it into their data model and to program it correctly. This will spoil the cake.

4. Don’t add something unless it’s absolutely necessary. Nobody will be able to lift a cake that’s too heavy.

5. Don’t strive for perfection. The last 5% takes 95% of the time. The wedding won’t wait for the cake. Improvements can be added in future revisions.

6. Bake the first cake quickly (strive for a year). Get tasters to try it and tell you what is good and what isn’t. Then update your cake recipe regularly (every two years).

I’m excited for FHISO. They are starting anew with Drew Smith at the head, and I’m looking forward to seeing whether his common sense and leadership will bring this to fruition.

6 Comments

1. geneg (geneg)
United States
Joined: Sat, 3 Aug 2013
1 blog comment, 0 forum posts
Posted: Sat, 3 Aug 2013  Permalink

What is the key non-compatibility issue among different vendors’ use of GEDCOM? Is it extensions to the schema to represent different kinds of metadata? Or is it something more fundamental about how facts and evidence are related? Or something else?

2. Louis Kessler (lkessler)
Canada
Joined: Sun, 9 Mar 2003
149 blog comments, 202 forum posts
Posted: Sat, 3 Aug 2013  Permalink

Gene,

There’s no “key” here. It’s all over the board.

James Tanner may be right that compatibility has never been important to developers. But they need to realize that people are getting smarter and it’s going to be more difficult to get someone to switch to their program if it is known that some things won’t get in and later not everything will come out.

Louis

3. Justin (justincyork)
United States
Joined: Sat, 3 Aug 2013
1 blog comment, 0 forum posts
Posted: Sat, 3 Aug 2013  Permalink

The community at large seems to have abandoned the idea that GEDCOM can be fixed. Why do you think that is?

4. Louis Kessler (lkessler)
Canada
Joined: Sun, 9 Mar 2003
149 blog comments, 202 forum posts
Posted: Sat, 3 Aug 2013  Permalink

Justin,

Excellent question.

Here are four reasons why the community doesn’t think GEDCOM can be fixed:

1. The community sees only that their data does not transfer and blames it all on GEDCOM. In fact, only part of the problem is GEDCOM, and those parts can be fixed.

2. They see that GEDCOM has not been updated in 20 years, and conclude it is old and in need of complete replacement. But I would say that if GEDCOM had been kept up, it likely would have had no more than three or four relatively minor updates. Developers have not been keen to update to match a 20-year-old unmaintained standard, whereas if the standard had been maintained, most developers would have kept up and data would be transferring much better today. We should now make these necessary updates. They won’t be as extensive as most people think.

3. They feel GEDCOM’s syntax is not a modern standard. In fact, GEDCOM’s syntax is defined using the Extended Backus-Naur Form (EBNF), which the International Organization for Standardization has adopted.

4. They feel GEDCOM’s syntax needs to be replaced with XML or JSON or whatever, because all modern tools use those. GEDCOM’s syntax can easily, mechanically and trivially be mapped to XML or JSON, and simple programs can be written to do that transfer. If that were so important, why aren’t there any major programs today that output GEDCOM as XML and promote the use of it? GEDCOM’s syntax has advantages over XML and JSON in that it is more compact, more readable, and gives just one way to format the data (compared to data in XML that can be stored as elements or as attributes). As a programmer, I don’t care about the format. To me, GEDCOM syntax, XML and JSON are equivalent and have nothing to do with data transfer ability.

So as you see, I don’t think any of those four arguments holds up.
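To illustrate the fourth point, here is a minimal sketch of that mechanical mapping, written in Python. It is not taken from Behold or any other program. The names `gedcom_to_tree`, `tree_to_xml` and `SAMPLE` are made up for this example, and it assumes each line has the usual shape (level number, optional @XREF@, tag, optional value). It ignores real-world details such as CONC/CONT continuation lines and character sets.

```python
import json
import re
import xml.etree.ElementTree as ET

# One GEDCOM line: level number, optional @XREF@ identifier, tag, optional value.
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\S+)(?: (.*))?$")


def gedcom_to_tree(text):
    """Turn GEDCOM text into nested dicts, using the level numbers for nesting."""
    root = {"tag": "ROOT", "children": []}
    stack = [root]  # stack[n] is the parent of a line at level n
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        match = LINE_RE.match(raw)
        if match is None:
            raise ValueError(f"not a GEDCOM line: {raw!r}")
        level = int(match.group(1))
        if level >= len(stack):
            raise ValueError(f"level jumps by more than one: {raw!r}")
        node = {"tag": match.group(3)}
        if match.group(2):
            node["xref"] = match.group(2)
        if match.group(4):
            node["value"] = match.group(4)
        node["children"] = []
        del stack[level + 1:]  # drop deeper levels left over from earlier lines
        stack[level]["children"].append(node)
        stack.append(node)
    return root["children"]


def tree_to_xml(nodes, parent):
    """Copy the nested dicts into XML elements under `parent`."""
    for node in nodes:
        elem = ET.SubElement(parent, node["tag"])
        if "xref" in node:
            elem.set("xref", node["xref"])
        if "value" in node:
            elem.set("value", node["value"])  # attribute here; element text would work too
        tree_to_xml(node["children"], elem)
    return parent


# Hypothetical sample data, for illustration only.
SAMPLE = """\
0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 3 AUG 1913
0 TRLR
"""

records = gedcom_to_tree(SAMPLE)
print(json.dumps(records, indent=2))
print(ET.tostring(tree_to_xml(records, ET.Element("gedcom")), encoding="unicode"))
```

Note that the XML half had to choose between storing values as attributes or as element text. The GEDCOM line itself leaves no such choice, which is the "one way to format the data" advantage mentioned above.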

Louis

Followup - Check out Justin’s blog post, “Everybody Benefits from Data Portability”, where Justin refutes some of James Tanner’s beliefs.

5. coret (coret)
Netherlands
Joined: Thu, 15 Dec 2011
2 blog comments, 0 forum posts
Posted: Sun, 4 Aug 2013  Permalink

Louis,

I like your recipe for GEDCOM, I can smell the result already!

I am not certain, though, what community you (and Justin) are referring to. I do not think it is ‘all genealogists’. I think a large part of ‘all genealogists’ use one program and are satisfied; even when transferring data to an online service, they don’t see or mind that some details are missing. I’ve also experienced that it’s hard to explain to ‘all genealogists’ what a standard is and why we need standards and a standards organisation like FHISO. This is another reason why good GEDCOM import/export isn’t a top priority of vendors.

If by community you (and Justin) mean ‘genealogists who recognize the need for standards’, then maybe it’s due to the fact that the GEDCOM standard is copyrighted by FamilySearch and therefore the community cannot change this de facto standard.

Has anyone asked FamilySearch to remove the copyright of GEDCOM?

Bob

6. Louis Kessler (lkessler)
Canada
Joined: Sun, 9 Mar 2003
149 blog comments, 202 forum posts
Posted: Sun, 4 Aug 2013  Permalink

Bob,

With BetterGEDCOM, I do recall discussing the copyright. In the BetterGEDCOM requirements, the copyright is listed as one of the requirement constraints. But I don’t remember any formal inquiry being made about it.

Yes, you are correct. If FHISO or anyone else decides to update GEDCOM rather than rewrite it, they’ll have to apply for permission to do so.

Actually the copyright in GEDCOM is shown as: “The Church of Jesus Christ of Latter-day Saints”. FamilySearch is a genealogy organization that they operate.

Louis

 

The Following 1 Site Has Linked Here

  1. Geneamusings: Best of the Genea-Blogs - 28 July to 3 August, 2013 : Sun, 4 Aug 2013
    A Recipe For GEDCOM by Louis Kessler on the Louis Kessler's Behold Blog. Louis describes how he thinks a replacement for GEDCOM should be modified and implemented.
