
Louis Kessler’s Behold Blog

Genealogy Software is Transforming - Thu, 27 Nov 2014

You might not have caught the relevance of the two announcements made on Monday and Tuesday that rocked the genealogy software world.

On November 24, MyHeritage announced: Family Historian Genealogy Software Integrates MyHeritage Matching Technologies for Automated Discoveries. And the next day, MyHeritage announced: RootsMagic Adds MyHeritage Matching Technologies for Powerful Automatic Research Capabilities.

These are actually the second and third announcements. The first came on November 13: MyHeritage Bolsters Leadership in the Netherlands with Strategic Partnerships, in which it made agreements with the Dutch program Aldfaer and with Coret Genealogie to integrate MyHeritage’s matching technologies.

These three announcements together are a blockbuster. Not one, but three major programs and one online system (Coret Genealogie’s Genealogie Online) are now providing within-the-program ability to access the MyHeritage collection of family trees and historical records using the matching technologies developed by MyHeritage.

Finally it has begun. Desktop genealogy software (still very necessary as the powerful and private way to assemble one’s own genealogy) is now able to take on some of the functionality (matching data) that was previously only possible online. I see the current announcements as just the first step. I expect many more announcements to come, and not necessarily just by MyHeritage.

This is all possible because of MyHeritage’s foresight in its development of a powerful yet simple API that they have made available. An API is an Application Programming Interface that allows programs like Aldfaer, Family Historian, RootsMagic and Behold to access and perform actions against MyHeritage’s data. FamilySearch and Ancestry have APIs as well, and I’ll talk about them in a minute.

I listened to Uri Gonen’s talk at RootsTech 2014 about the MyHeritage API and came away very impressed. More than that, the one thing that differentiates MyHeritage from Ancestry and FamilySearch is its forthright effort to get software developers to work with them.

When I was at RootsTech 2012, I talked to Mark Olsen of MyHeritage. On finding out I was a developer, he immediately wanted to work with me and give me access to MyHeritage and their API. But Behold wasn’t ready for that at the time. At RootsTech 2014, Mark Olsen saw me and even remembered me. He asked how Behold was coming along and told me to give him a call as soon as I’m ready. Last month at Gaenovium, MyHeritage, which was the main sponsor and had two representatives at the event, further expressed their interest in getting me to work with them.

By comparison, with FamilySearch, I talked to many of their people. I discussed software ideas and expressed interest in becoming a FamilySearch partner and accessing their data through their API. Basically they had a ho-hum attitude about this. I had to go after them to get in. If I did, fine. If not, then that was probably fine also. They weren’t out pounding on doors actively seeking new products. But they did have people I could talk to if I wanted and information online on how to get started.

Ancestry, meanwhile, was completely invisible and did not seem to take any interest in me whatsoever. Ancestry gives you this friendly page if you are a developer interested in linking to them. Note that the link on the page doesn’t work. Compare that to say, what 23andMe provides to developers who want to access their API. And not incidentally, MyHeritage announced a strategic collaboration with 23andMe on October 21.

So Ancestry has their TreeSync between their desktop program Family Tree Maker and their online family tree databases. Their just-released 2014 Service Pack will hopefully fix that often-crashing and poorly working feature. They use their API exclusively in Family Tree Maker and are not making that functionality available to third parties. So they think they can do it alone, exactly the opposite of the view MyHeritage takes.

With these recent announcements, MyHeritage has thrown down the gauntlet, and the battle of the APIs has begun. Who will win?

Company A: Who goes out of their way to find and partner with developers.

Company B: Who lets the developers come to them on their own.

Company C: Who tries to do it all on their own.

If anyone stands to benefit, it may be the FHISO people, who are trying to come up with a new genealogy data transfer standard to replace GEDCOM. While they are embroiled in an intense discussion of the minutiae making up all the data we might ever want to transfer, these companies are doing relevant data transfer right now through their APIs. One or more of these APIs may take over and become the new “standard”, and FHISO would be saved a lot of work.

Okay, so what about Behold? Is Behold being left behind? Well, as soon as I can get my act together and get the next few releases out, I’ll be able to program Behold to take advantage of these APIs and make this within-the-program access to MyHeritage and FamilySearch possible. Ancestry too, if one day, under the pressure of the success of MH and FS, they decide to make their API open.

Behold will be able to do a better job in the presentation of this information. Programs like Aldfaer, Family Historian and RootsMagic provide forms-based input with generated reports. They need to display the MyHeritage match information on their input screens, and are relegated to using little icons or summarized lines. Behold, with its report-based Everything Report for both input (coming in 2.0) and output, will be able to present this data complete and in place, where you need it and where you can do your editing directly and immediately see your results. I’m so excited that I’m now mad I spent this time writing this blog post when I could have been working on Behold to get it there.

I really commend this move by MyHeritage with Aldfaer, Coret Genealogie, Family Historian and RootsMagic. We’ve been waiting for years for something big to happen in the battle between the big three online databases. This is the big one. FamilySearch and Ancestry will take notice and the genealogy community will benefit from the result of this.

Newly Rediscovered: GEDCOM 4.0 (and a bonus!) - Sun, 16 Nov 2014

Trying to write a flexible GEDCOM reader to read in every flavour of GEDCOM back to the early days is rather difficult when the standards used prior to version 5.3 from 1993 just aren’t available anywhere.
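To give an idea of what such a flexible reader starts from, here is a minimal sketch of a tolerant line parser in Python. It is not Behold’s code. It assumes the familiar line shape of a level number, an optional cross-reference identifier, a tag and an optional value, which is how the later 5.x versions lay out a line; the earliest versions may well differ, which is exactly why the old documents are needed. The names `GedcomLine` and `parse_line` and the sample lines are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed line shape (from later GEDCOM versions): LEVEL [@XREF@] TAG [VALUE]
LINE_RE = re.compile(
    r"^\s*(?P<level>\d+)"            # level number, leading blanks tolerated
    r"(?:\s+@(?P<xref>[^@]+)@)?"     # optional cross-reference identifier
    r"\s+(?P<tag>[A-Za-z0-9_]+)"     # tag
    r"(?:\s(?P<value>.*))?$"         # optional value after one separator
)


class GedcomLine(NamedTuple):
    level: int
    xref: Optional[str]
    tag: str
    value: str


def parse_line(raw: str) -> Optional[GedcomLine]:
    """Parse one line; return None instead of raising on anything unexpected."""
    match = LINE_RE.match(raw.rstrip("\r\n"))
    if match is None:
        return None
    return GedcomLine(
        level=int(match.group("level")),
        xref=match.group("xref"),
        tag=match.group("tag").upper(),  # tolerate lower-case tags
        value=match.group("value") or "",
    )


if __name__ == "__main__":
    # Made-up sample lines, not taken from any real file.
    for text in ["0 @I1@ INDI", "1 NAME John /Smith/", "1 FAMS @F1@"]:
        print(parse_line(text))
```

Returning `None` for a line that doesn’t fit, rather than stopping, lets the reader log the oddity and carry on, which matters when every flavour of GEDCOM has its own quirks.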

I have scoured the Internet, and the Internet archives for that matter, to try to dig up some of the early documents. In particular, I couldn’t find any of these versions: 1.0 (1984), 2.0, 2.3, 2.4, 3.0, 4.0, 4.1, 4.2, 5.0, 5.1 or 5.2 (1992), some of which were said to be “Drafts” and some of which were said to be “Standards”.

A few months ago, I started emailing some of the developers I know who have been using GEDCOM since the early days. I even contacted Bill Harten, who led the team developing GEDCOM. He and some of the other developers told me they may still have old printed copies somewhere, and if those weren’t thrown out or eaten by rats, they might be in some old box in a garage or attic that would take months to sort through. <Sigh>. It seems that GEDCOM 5.3 was the first version that made it to electronic form. All the other versions were hardcopy … until now.

I corresponded with Diedrich Hesmer, the developer of Our Family Book and GEDCOM Service Programs, and he contacted the members of his GEDCOM-L list, where 24 German-speaking genealogy software programmers communicate with each other. It turned out that one of his colleagues, Gisbert Berwe, the author of the program Gen_Plus, found he had a printed copy of GEDCOM 4.0. He scanned it and has now posted it on his website.

You can find Gisbert’s PDF of GEDCOM 4.0 here:
www.genpluswin.de/gedcom/Gedcom_4.pdf

The first 12 pages of this document are not actually the GEDCOM standard, but are the Data Structure Description of the Personal Ancestral File program version 2.1, dated 23 June 1988. I think it is likely that these pages were at the beginning of the document, since PAF and GEDCOM were being developed at the same time by the Family History Department of the LDS. The 12th page appears to be a page Gisbert may have included by mistake, being a German family relationship chart, instead of page 12 out of 12 of the PAF structure guide. But that doesn’t matter because the GEDCOM specs follow.

The GEDCOM Standard 4.0

The GEDCOM 4.0 specifications follow in the next 96 pages. The sections include:

Introduction
Chapter 1: Specification for GEDCOM Level Numbers
Chapter 2: Specification for GEDCOM Tags
Chapter 3: Specification for GEDCOM Transmission Headers and Trailers
Chapter 4: Specification for GEDCOM Cross-Reference Identifiers
Chapter 5: Specification for GEDCOM Values
Chapter 6: Specification for GEDCOM Character Sets
Chapter 7: Specification for GEDCOM Transmission Media
Appendix: GEDCOM Tags

The Introduction is 2 pages. The first page of the introduction is missing from the scan, and the second page is misplaced after page 1-1. I’ll have to see if Gisbert could scan page In1. The seven chapters are 35 pages, and the Appendix, which contains an alphabetical list of the Tags and their definitions, is 47 pages. Gisbert accidentally repeated the cover page and included it where page A-42 should have been.

In my initial scanning of the document, I had to admire the GEDCOM team’s admission that the standard wasn’t perfect. On page In2, they state:

Future Editions of This Document

GEDCOM is still new, and has not yet been exposed to demanding applications over an extended period. Changes will be made as necessary. Chapter five, “Specification for GEDCOM Values,” will be updated to include format definitions for digitized photo, audio, and video information when the need arises and the required specifications have been completed. 

 

Bonus Document! PAF GEDCOM Specifications

I didn’t expect what followed in the next 34 pages.

PAF GEDCOM Specifications 1990

This document is not GEDCOM, but it details the implementation of GEDCOM in PAF Releases 2.0 through 2.2.

It states in its Introduction, Page 3 of 34:

“This document is necessary because many essential details about data structure and the use of tags in GEDCOM are specific to the implementation. In addition, the PAF 2.1 and 2.2 implementations differ from the PAF 2.0 implementation. PAF 2.0 was developed while the GEDCOM standard was still being refined. PAF Release 2.1 and 2.2 conform to the GEDCOM standards formally approved by the Family History Department in October 1987 (GEDCOM Release 3.0) and August 1989 (GEDCOM Release 4.0).”

In other words,
the PAF 2.2 implementation tells us more about GEDCOM 4.0,
the PAF 2.1 implementation tells us about GEDCOM 3.0, and
the PAF 2.0 implementation tells us about GEDCOM 2.0.

Wow! What a find!

Just like all archaeological digs, it will take time to study and analyze the details of these GEDCOM 4.0 and PAF GEDCOM Specifications before the lives of the early GEDCOMonians can be fully understood.

Source-Based Thinking - Tue, 11 Nov 2014

It’s time genealogists stopped their conclusion-based thinking and started going source-based.

Source-Based Document Organization

Source-Based Data Entry

Standardizing Sources

and now

Repository-Based To Do Lists

(Do you think there’s a not-so-subtle theme here?)

I advocate that everything is better if you start with the source. 

Genealogists have for too long been recording their name/event/fact/relationship conclusions first manually onto family group sheets, and now into the similarly organized data entry forms of their genealogy software. If they think of it, and if their software makes it convenient enough, they then just might decide to add a source to it … if they feel like it.

I find it unbelievable that people do all this work, and the result is they have no idea what they’ve entered and what they haven’t. Their source materials are a shambles. They can’t find their originals since their filing system for their physical documents is an unorganized mess, as are their computer files. Or even worse: organized by family.

By organizing your documents, computer files, data entry method, to do lists, and everything else by source, suddenly the world opens up. You know where you are and where you are going.

One-name and one-place studiers have known this for years. They thoroughly analyze all the information they can from every source. They know what they’ve extracted and what they haven’t. Every item of information they have comes first from the source and is entered by source, and every item is documented with its source. Their to do lists are the sources they are going to look at. The only thing they lack is good source-based software, because almost all genealogy software is conclusion-based and provides minimal help for them.

How Do You Find Your Needles in Haystacks?

Are you looking through a hundred haystacks for one needle? Then are you looking through the hundred haystacks again for the next needle? That’s a lot of work for every needle. For every little fact you need.

Wouldn’t it be better to look through one haystack at a time and find all the needles you can in that haystack? Once that’s done, go to the next haystack. Get everything you can out of every document while you’re accessing that document.
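For the programmers reading, the one-haystack-at-a-time idea is just an inversion of the usual list. Here is a minimal sketch in Python, assuming you start with a list of facts you want and, for each, the sources that might hold it. The function name `worklist_by_source` and the sample facts and sources are made up for illustration; this is not how Behold or any other program stores things.

```python
from collections import defaultdict
from typing import Dict, List


def worklist_by_source(needed_facts: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Turn {fact: [sources to check]} into {source: [facts to look for]}."""
    by_source: Dict[str, List[str]] = defaultdict(list)
    for fact, sources in needed_facts.items():
        for source in sources:
            by_source[source].append(fact)
    return dict(by_source)


if __name__ == "__main__":
    # Hypothetical needles and haystacks.
    needed = {
        "birth of Anna": ["1881 census", "parish register"],
        "marriage of Anna": ["parish register"],
    }
    for source, facts in worklist_by_source(needed).items():
        print(source, "->", facts)
```

Each key in the result is one haystack, and its list is every needle to look for while that document is in front of you.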

Source-based genealogy. Source-based thinking. It will change everything.

Behold’s getting there.