Louis Kessler’s Behold Blog

#Gaenovium and #Famillement in Tweets - Sat, 11 Oct 2014

I had a wonderful couple of days in Leiden, Netherlands. I live tweeted the two events I attended and included pictures of some of the people I had the pleasure to meet.

Here are the highlights of Gaenovium on Tuesday through some of my tweets and tweets of others:

The first talk was Bob Coret: Open Genealogy Data in The Netherlands.

The second talk was Marijn Schraagen: Algorithms for Historical Record Linkage.

The third talk was Michel Brinckman: The A2A Data Model and its application in WieWasWie.

The fourth talk was Timo Kracke: GOV: The Genealogical Gazetteer API.

The last talk was mine: Louis Kessler: Reading wrong GEDCOM right.

Then came the panel discussion: Current & Future Genealogical Exchange Standards.

And the after-conference meal.

Some final comments.

The next day, Wednesday, there was a family history trade show called Famillement, and many of the Gaenovium participants were there.

A Bit of History as GEDCOM Approaches Its 30th Year - Thu, 18 Sep 2014

I was looking for the early GEDCOM specifications prior to GEDCOM 5.3 from 1993, so I contacted some people I knew who might still have some of the early GEDCOM material lying around. One of them was Bill Harten, sometimes referred to as “the father of GEDCOM”, whom I met and had some nice conversations with at RootsTech 2014 in February.

In our correspondence, Bill told me a story I had never heard, one that only he would know, about the genesis of GEDCOM. Bill gave me permission to publish it. Here is what he said:

“GEDCOM’s tag-hierarchy technology was re-purposed from a database technology named AIM that I invented in 1984 to provide both high flexibility and high performance for complex ‘big data.’ AIM’s serialized representation was perfect for a data transfer format and GEDCOM needed to handle complex data in a flexible way. FamilySearch used the AIM database technology to distribute their products on compact disc readers and personal computers from 1985 until FamilySearch.org was deployed fifteen years later. You would recognize this technology today as an XML-based DBMS, although AIM includes advanced indexing, search and compression concepts that are still unavailable in commercial DBMS products now 30 years later.

GEDCOM 1.0 was published and delivered to about 30 attendees of the first GEDCOM Developers’ Conference in 1985 in the Family History Library in Salt Lake City. I sent out invitations to developers, media, and genealogical community representatives, who responded enthusiastically. GEDCOM 1.0 was a draft prepared for their review. We incorporated their feedback into GEDCOM 2 which became the first official release. GEDCOM 2 was first implemented by Personal Ancestral File and by Roots, a popular software produced by Howard Nurse. The original publisher of Genealogical Computing magazine took a strong stand in favor of GEDCOM and recommended that readers demand GEDCOM compatibility of any product they purchase. The combination of two initial implementations and favorable media coverage was by design. Their cooperation was secured in advance of the conference. Other vendors were supportive and eventually implemented 2.0 as well, and GEDCOM eventually rose to achieve critical mass.

At this conference I initially proposed that the developers join together under an ISO committee that we would form for the purpose of establishing GEDCOM as a formal international standard. I told them the cost in committee-time and money. They all expressed that the cost was too high given their small size. I anticipated this and had prepared an alternative that I described as a ‘de facto standard’ approach. In this approach the Family History department would commit to implement GEDCOM in PAF and Ancestral File, and share data with whoever would support the format. If others liked it they could follow suit. This approach was adopted unanimously. The list of GEDCOM-compatible products grew rapidly. In the first year a feedback-driven process was established for adding tags and evolving GEDCOM, and I managed this process through several releases, the last being GEDCOM 5. In 1991 I visited software developers in France, Germany, Great Britain and Russia to help them understand and adopt GEDCOM.

We learned that some developers were challenged by GEDCOM’s hierarchical data structure because it did not yield very well to traditional flat-record programming methods. Our AIM database was based on a powerful tree-based data structure and accompanying algorithms for representing, parsing, and traversing hierarchical data internally in memory, so we cloned this code and created the GEDCOM library, and made this available to the GEDCOM community in the public domain. Not all developers used it, but those who did seemed to have a better experience and fewer problems than those who used more traditional approaches. We see this same data structure today in the Document Object Model (DOM) trees that form the backbone of the modern web browser, and many of the names of the DOM structures and methods in browsers and DOM libraries today are the same as we used in the GEDCOM library a decade earlier.”
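To make the tree idea in Bill’s last paragraph concrete, here is a minimal sketch in Python of reading level-numbered GEDCOM lines into a tree of nodes. It is only an illustration and not the code of the GEDCOM library: the Node class and the parse_gedcom function are hypothetical names, and the sketch ignores details such as CONC/CONT continuation lines and character sets.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One GEDCOM line plus the lines nested beneath it (hypothetical type)."""
    level: int
    tag: str
    value: str = ""
    xref: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def parse_gedcom(text: str) -> List[Node]:
    """Build a list of record trees from GEDCOM-style lines."""
    roots: List[Node] = []
    stack: List[Node] = []  # open ancestors, shallowest first
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        parts = line.split(" ", 2)
        level = int(parts[0])
        if len(parts) > 1 and parts[1].startswith("@"):
            # Record line: level, @XREF@, tag, optional value
            xref = parts[1]
            rest = parts[2].split(" ", 1) if len(parts) > 2 else [""]
            tag = rest[0]
            value = rest[1] if len(rest) > 1 else ""
        else:
            # Ordinary line: level, tag, optional value
            xref = None
            tag = parts[1] if len(parts) > 1 else ""
            value = parts[2] if len(parts) > 2 else ""
        node = Node(level, tag, value, xref)
        del stack[level:]  # close any nodes at this level or deeper
        if stack:
            stack[-1].children.append(node)
        else:
            roots.append(node)
        stack.append(node)
    return roots


sample = "0 @I1@ INDI\n1 NAME John /Smith/\n1 BIRT\n2 DATE 1 JAN 1900\n0 TRLR"
records = parse_gedcom(sample)
```

The stack of open ancestors is what lets a flat file be read in one pass: each line’s level number says how far to unwind before attaching the new node, which is the part that flat-record methods have no natural place for.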

Let me also refer you to a few posts Bill Harten made long ago on the GEDCOM-L mailing list that explain more of the early thinking about GEDCOM. They are very interesting reads:

Bill’s first post on the GEDCOM-L list:
LDS GEDCOM Report, 22 Oct 1994

Although database technology has since improved, the ideas at the time about the database model were both innovative and necessary:
Database: GEDCOM’s Genesis. 28 Oct 1994

Why wasn’t GEDCOM developed through a formal standards organization?
GEDCOM and Formal Standards Organizations. 24 Jan 1996

You have to conclude that there was a lot of thinking and effort put into the development of GEDCOM.

I think this material is a must read for the FHISO people, because to create a new standard today, they’ll have to rehash many of the same problems that Bill & Company had to address.

And if anyone still has any of the early GEDCOM specs prior to 5.3, Bill and I would love to see them.

Saving Web Pages - Wed, 3 Sep 2014

Here’s a quick tip for genealogists or other online researchers.

When you find information of interest on a website, don’t just print it or save a link to the website somewhere. Instead, save the complete web page.

Even if you use a note-taking tool such as OneNote or Evernote, you still may find doing this is better for certain web pages since you may get a more accurate representation of the page.

In Internet Explorer, from the menu select Page –> Save as… (or press Ctrl+S) and set Save as type to Web Archive, single file (*.mht)

In Google Chrome, right-click and select Save as… (or press Ctrl+S) and set Save as type to Webpage, Complete.

In Firefox, right-click and select Save Page As… (or press Ctrl+S) and set Save as type to Web Page, complete (*.htm;*.html)

Chrome and Firefox both save a .htm or .html file for the web page but they also have to create a directory of the same name containing all the files the webpage needs to display properly (e.g. graphics and style sheets).

Internet Explorer can do this as well by setting Save as type to Webpage, complete (*.htm;*.html), but I prefer the Web Archive because it’s cleaner: everything is in the one .mht file and no directory needs to be created. However, I’m sad that Internet Explorer does not offer the option to save the page when you right-click, as Chrome and Firefox do.

I always suggest organizing your source materials by where you got them from, so for websites, make a folder with the website address.
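If you like to script this sort of thing, the folder name can be derived from the page address. This is just a small sketch in Python; the folder_for_url function and the "Sources" base folder are made-up names for illustration, and it assumes the address includes its http:// or https:// prefix.

```python
from pathlib import Path
from urllib.parse import urlsplit


def folder_for_url(base_dir: str, url: str) -> Path:
    """Return base_dir/<website address> for a saved page (hypothetical helper)."""
    # hostname is None when the URL has no scheme, so fall back to a placeholder
    host = urlsplit(url).hostname or "unknown-site"
    return Path(base_dir) / host


target = folder_for_url("Sources", "http://www.example.com/records/page.html")
# target.mkdir(parents=True, exist_ok=True) would create the folder on disk
```

Saved pages from the same site then end up side by side under one folder named for that site.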

This will give you a complete copy of the web page you have found with all the information and graphics intact. You never know when that page might vanish from cyberspace, and if it is not available, it may be lost forever.

Note: these methods will not save videos and sound files, but will save almost everything else.