A month ago, I blogged about the BetterGEDCOM endeavor. In the ensuing month, I’ve gotten involved and added my two cents’ worth. I find I’m mostly a lone wolf in the woods.
Up until a few days ago, I didn’t know why my ideas were so different from everyone else’s. Almost all of them are of the opinion that the GEDCOM standard has major shortcomings and needs to be replaced. I differ in my view and, as I said a month ago, think it needs to be brought up to date (via XML and Unicode) and maybe could use a few improvements. I’m talking about tweaking, not a major overhaul: an evolution and not a rewrite.
But I think I found out what’s going on and why the opinion is that GEDCOM is bad. Over at the BetterGEDCOM blog, a site run by some of the people who started the BetterGEDCOM initiative, they’ve been running GEDCOM tests. They take one program, say Family Tree, enter some data into it, export it to a GEDCOM file, import that GEDCOM into a second program, say RootsMagic, and then see that the second program doesn’t show the data in the same way. They’ve done many tests; for example, their latest was done with address information. Their conclusion was: “Looking at the GEDCOM 5.5 it appears that the format is correct. However, how and where this entry should be, may be the problem in the GEDCOM Standard.”
Now I give them a lot of credit for doing these tests to see how programs differ, but by highlighting how poorly the programs have implemented GEDCOM, they are spreading the illusion that GEDCOM is to blame.
My experience with GEDCOM is very different. In the development of Behold, I’ve tried to make Behold as flexible a reader as possible. I want Behold to display the data from any GEDCOM that was created from any version of any genealogy program, as accurately as possible. Doing so, I had to study various parts of GEDCOM in detail, and try to interpret how each program was exporting their data in terms of the GEDCOM standard.
Almost all programs do something different somewhere. They may add extra tags in a standard way, or in their own non-standard way. They may use some GEDCOM constructs a bit incorrectly, or they may abuse them totally. Generally the programs come close to the GEDCOM standard, but it is rare to see a perfect GEDCOM meeting all rules of the standard.
One good example is the CONC (Concatenate) tag. The GEDCOM standard says the word on the line preceding the CONC tag must be split so that some letters of the word are on the preceding line and some are at the start of the CONC line. Then the program reading the GEDCOM should paste the two lines together with no spaces. But some programs export this wrong: the line before their CONC ends with a complete word and the CONC line starts with the next word. What this means is that the program reading the GEDCOM needs to put a space between the two words when it puts the lines together. Now look what this does to me, the beleaguered programmer. First I have to program this for both cases, the correct and the incorrect program. If I assume the program is correct but it isn’t, words will be plastered together. If I assume the program is incorrect but it is correct, there will be spaces where there shouldn’t be. This poor programmer now has to maintain a table of incorrect programs (or put in some fancy algorithm to try to detect how each program handles CONC - and even this is subject to error). The bottom line is that every one of these misinterpretations of GEDCOM is a lot of extra work to handle, and it needn’t be so if every program just followed the standard.
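To make the two cases concrete, here is a minimal sketch in Python of the "table of incorrect programs" approach. It is not how Behold does it. The function name, the table and the program identifiers in it are all made up for illustration, and it assumes the exporting program can be identified somehow (for instance from the header of the GEDCOM file).

```python
# Hypothetical table of exporting programs known to break lines between
# words instead of mid-word. The entries are invented placeholders.
SPLITS_BETWEEN_WORDS = {"EXAMPLE_PROGRAM_A", "EXAMPLE_PROGRAM_B"}


def join_conc(first_value, conc_values, source_program=None):
    """Join a line's value with the values of the CONC lines that follow it.

    The standard's rule is to paste the pieces together with no space.
    For programs listed in SPLITS_BETWEEN_WORDS, a single space is
    inserted between pieces instead.
    """
    separator = " " if source_program in SPLITS_BETWEEN_WORDS else ""
    return separator.join([first_value, *conc_values])


# Correct export: the word "sentence" is split across the two lines.
print(join_conc("This is a long sen", ["tence."]))

# Incorrect export from a program in the table: a space is added.
print(join_conc("This is a long", ["sentence."], "EXAMPLE_PROGRAM_A"))

# Incorrect export from a program NOT in the table: words get plastered.
print(join_conc("This is a long", ["sentence."]))
```

The last call shows the failure mode: if the table is missing an entry, the output has words run together, and nothing in the file itself says which rule the exporter followed.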
The implication is that because programs are not following the standard, there must be something wrong with the standard that makes it difficult for the programs to follow it. Therefore this must be corrected. Well, to me, that’s hogwash. In this case there is nothing wrong with the standard. The problem is with the programs and the programmers who aren’t diligent enough to follow the standard in the first place and who, when it is pointed out that they are doing it wrong, aren’t accountable enough to change their program so that it does it right.
Okay. So that’s case in point #1. Maybe they need to learn GEDCOMBetter, rather than get a BetterGEDCOM. With their current work ethic, they’ll just implement that BetterGEDCOM incorrectly as well, and the result will be that BetterGEDCOM is no better than the GEDCOM it was intended to be better than. (Whew, that’s a mouthful!)
But the 2nd point is maybe more important than the first. That is that there is a misconception that GEDCOM has major shortcomings. Now I’m not saying it’s perfect, and as I keep saying, I’d like to see a few changes to it myself, but check what the BetterGEDCOM initiative has decided is wrong with GEDCOM.
They have a “GEDCOM Messes This Up” section and mention sources, citations and certainty assessment. In my observations, GEDCOM has a very advanced facility for sources and citations and a wonderfully simple and usable certainty assessment. GEDCOM doesn’t mess it up. The programs that implement it incorrectly mess it up.
Then there’s a “GEDCOM Won’t Transfer This” section. Sources are again mentioned, but Behold has no trouble reading and displaying them from a multitude of programs. They transfer fine for me. I’ve seen very exhaustive and detailed sourcing and ensured that Behold can display it all correctly. It’s not my fault or GEDCOM’s fault that some programs don’t export their sources. With regards to places and documents/images, yes GEDCOM could use some tweaks to support those better. But these are tweaks in a natural evolution of GEDCOM, and not something deserving a total rewrite.
“I Want My Genealogy Software And BetterGEDCOM To Do This: Handle evidence and not just conclusions. Do conclusion chaining. Round-trip the data, (and a number of other things).” These are all things up to the software programmer to implement. Current GEDCOM can handle it just fine. This doesn’t need a BetterGEDCOM for it to be done.
But what I’ve found is that GEDCOM itself is not well understood, and unfortunately not even by those attempting to change it. I’m not sure how you build a “better” house, if you don’t know how good the house was that you had before.
I have come to respect GEDCOM and the people at LDS who developed it. It was quite a major work effort spanning many years and many versions. They had to do a lot of deep thinking about what was wanted, and then implement it in some logical way. When you look at some of the details of GEDCOM, you see some very advanced capabilities, but many of these have seldom been used by genealogy programs. This is partly because GEDCOM evolved and matured faster than programmers could handle it - and then it stopped at version 5.5, the unofficial 5.5.1 and the Draft XML 6.0.
Was it a good standard? Absolutely. It allows data transfer between programs. And you know what happened in the ensuing years: just about every single program adopted GEDCOM as its import and export mechanism. I’m trying to think of other standards that 99.9% of an industry has adopted. There aren’t very many. To me that makes it more than a good standard. That makes it a great standard. The measure of a great standard is how many people use it.
Some people even started using it as a data store. I remember decades ago when Cliff Manis used Tom Wetmore’s Lifelines program and created the GenServ System that still exists and collects names. Cliff, if you’re still out there, do you remember when I visited you at your home when you had first started it up in the early ’90s? That was me. You were a very gracious host. So here we had a system of data storage, built purely out of GEDCOM, that preceded the Genis, MyHeritages and OneGreatFamilys of the world. We had and have a GEDCOM standard that can be used for data transfer and for data archive and retrieval.
Again, is it perfect? No it’s not perfect. It has aged after 15 years and needs a few tweaks.
What it doesn’t need is a rewrite.
Maybe what’s really needed is an education program. So that developers will be able to study and learn what treasures are really hidden in the old GEDCOM standard. So that they’ll be able to learn how to implement the features correctly. And so they won’t go off trying to rebuild from scratch the rooms in the house that are perfectly fine.