Louis Kessler's Behold Blog

GEDCOM 5.0 Rediscovered - 6 hrs, 51 min ago

Further adventures in the attempt to rediscover the long lost early GEDCOM standards have once again come up with something significant.

You may or may not remember that just over a year ago, I started this venture:

  1. Newly Rediscovered: GEDCOM 4.0 (and a bonus!)
  2. More GEDCOM Archaeological Discoveries

Well, I never did get the copies of GEDCOM 3.0 or 5.0 from Brian Madsen. I did email him back a couple of times without a response, so I decided not to bother him any more.

But a couple of weeks ago, I had noticed that there is still activity on the mailing list for LINES-L, which was a mailing list for Tom Wetmore’s LifeLines program. That mailing list has been around for a long time, and I realized the people on the list would have been technical genealogical old-timers, people who might have seen some of those early GEDCOM standards.

So a week ago I posted a query on the LINES-L list. I got a response from Peter Glassenbury of New Zealand who sent me GEDCOM 5.0, and is the person to be credited as the “retriever” of this relic.

What he sent me was this text file of GEDCOM 5.0. I’m sure Tamura Jones will add this to his FamilySearch GEDCOM Specifications page so that it will be available to everyone.

A comparison of this version 5.0 text file with the version 5.3 text file is very interesting. You can see that GEDCOM was evolving from version to version.


There were also a 5.1 and a 5.2 (both drafts) between 5.0 and 5.3 (which were also both drafts). Obtaining 5.1 and 5.2 would allow seeing the changes in smaller steps, and that would make the conceptual differences between versions more understandable. Anyone who works on a future replacement of GEDCOM should study these concepts so as to understand the well-thought-out reasoning for why some constructs were added and others were removed from each version.

We’re still looking for the following GEDCOM versions:

1.0 (1984)
2.0 (Dec 1985) PAF 2.0
2.3 Draft (7 August 1985) with PAF 2.0 GEDCOM implementation
2.4 Draft (13 December 1985) with PAF2.0 GEDCOM implementation
2.1 (Feb 1987) GEDCOM for PAF 2.1
3.0 Standard (9 October 1987)
PAF 2.0 and 2.1 implementation of 3.0 (8 June 1988)
4.1 Draft
4.2 Draft (25 January 1990)
5.1 Draft (18 September 1992)
5.2 Draft (2 June 1993)

If anyone has any of the above, we would really appreciate it if you could scan them and contribute them, to preserve the history of GEDCOM and to allow the study of its development.

RootsTech 2016 Random Thoughts - 13 hrs, 56 min ago

Although I’m not at #RootsTech this year, I’ve found this the easiest one yet to feel a part of from afar. The live streaming is better than ever, and the Twitter feeds I follow include dozens of people in attendance who are doing a great job with live updates and links to blog posts about the event.

On Thursday night, I found I had to tweet this:

At 26,000 attendees (I hear it might get up as high as 30,000), they’ve really upped their in-person attendance. Last year they reported 20,000. When I was last there in 2014, it was 13,000. Now it’s true that many of those are for the Saturday Family Discovery day, but nonetheless, the total number has doubled in 2 years. This event has definitely made its mark and become a significant annual event for genealogists and family historians.

A lot of people are not so high on RootsTech’s emphasis on family stories, stating that this is not really genealogy. But I personally don’t mind this emphasis, since I agree that stories are very important and it could be a good way to get people interested in genealogy who otherwise would not be.

As a developer, I am very interested in the technical sessions. I was so pleased to see that in addition to the Innovator’s Summit the day before RootsTech, they have moved the annual BYU Family History Technology Workshop to the day before the Innovator’s Summit. They had what looked like a marvellous program for the day that was blogged about by James Tanner. I’ve met and conversed with many of these young technologists in web conversations and I’ll be looking forward to being able to combine this event with a future RootsTech that I attend.

The Innovator’s Summit sounded interesting. There are always a few sessions there I want to attend. But talking directly with other developers on that day is what I really enjoy, especially if it’s like the morning get-together we had for breakfast in 2014.

The “big” highlight for RootsTech seems to have become the Innovator Showdown. There were 46 applicants vying for $100,000 in cash and prizes. Some of the tweets I was following by @TamuraJones were asking where the “innovation” in these apps is, and @susankitchens was trying to track down past finalists to see what happened to them. One of the most amusing and true tweets about the session with the finalists on stage being asked questions by the judges was:

I’ll be seeing Judy Russell, the Legal Genealogist, over the next few weeks. I was there to hear Judy as one of the keynotes at RootsTech 2014. This year, Judy was one of the judges for the Innovator Showdown and I’ll be very interested to hear her opinions about that event and about the software submitted. I had submitted Behold as an entrant in the 2012 contest, then called the Developer Challenge. I had some sour grapes about that back then. But it did allow me as a newbie at RootsTech to get on a panel and demo Behold and meet a lot of people and make a lot of friendships.

Thursday and Friday at RootsTech are always made up of too-many-sessions-you-want-to-see-that-you-can’t-because-they-are-all-at-the-same-time and too-many-vendors-in-the-exhibit-hall-to-see-because-there-just-isn’t-enough-time-available-in-two-days – but picking and choosing and doing the best you can is always great.

What I really enjoy is seeing people, especially technical people that I have previously met, corresponded with, or follow on Twitter or Google Plus, and talking with them. Next best, when not at RootsTech, is to watch interviews, such as the “ambush cams” done by Dear Myrtle, like this one with Tony Proctor, or this interview by Jill Ball (Geniaus) with Judy Russell.

I’m writing this while watching the Saturday morning General Session live streamed. Saturday is always such a sad day at RootsTech because it is the last. Time to say goodbyes and try to pull yourself together and comprehend the enormity of the experience of the last few days.

Because of my preparations for the 10th Unlock the Past Cruise next week, I didn’t think it prudent to go to RootsTech this year. But counting on good luck and good planning, I’ve already etched February 8 to 11, 2017 into my calendar.

By the way, make a point to download any of the Syllabi you may want to now, because they usually get taken down shortly after RootsTech ends. There are no descriptions for them on the Syllabi page, so use the Session Guide to find the ones you may be interested in.

How to Program Dates for Genealogy - 5 days, 7 hrs ago

Dates in genealogy are messy. Four years ago, I wrote a series of posts on some of the aspects of dates as defined by GEDCOM and some of the bad dates you see in GEDCOM files in the wild:

Sort of a Date
How About a Date?
Out on a Bad Date

In a recent post on the RootsDev Google Group, Gary Stanley asked about Incomplete Dates: “Does anyone have any recommendations for working with incomplete dates and storing them within a database such as MySQL?”

For the record, I thought I’d lay out the method I decided on. It is based on GEDCOM’s date definition which, as I’ve said in the above articles, I think is quite a good specification.

The good thing about the GEDCOM spec is that it is readable. The bad thing is that it does not sort, so an alternative internal format is needed to sort. So here is what I came up with and use internally in Behold. It is also how I’ll export dates in the database that Behold will produce.

The structure is a 12- or 24-character string (24 if it’s a date range) that can be sorted just by letting the computer naturally sort the strings.

Behold’s internal date format is:

C (Calendar): ‘/’ = Gregorian, ‘A’ = Julian, ‘F’ = French, ‘H’ = Hebrew
B (B.C.): ‘1’ = B.C., ‘2’ = A.D.
         If B.C., then the YYYY is set to 9999 - YYYY so it sorts correctly
YYYY (Year)
M (Month): ‘1’ .. ‘C’ for JAN to DEC
                 ‘D’ .. ‘O’ for VEND to COMP
                 ‘P’ .. ‘Z’ for TSH to ELL
DD (Day)
* (Date Modifier): 1 = BEF, 2 = TO, 3 = (none), 4 = ABT, 5 = CAL, 6 = EST,
                         7 = INT, 8 = AND, 9 = FROM, A = BET, B = AFT
AA (Alternate year): e.g. the "93" in 1592/93
CBYYYYMDD*AA = the second date in a date range (only if needed)

e.g. 4 Nov 1900 = ‘/21900B043  ’
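
As a rough illustration only (this is not Behold's actual code), the layout above could be built in Python along these lines. The function name `behold_key` and its parameters are made up for this sketch. It assumes the month codes run 1 to 9 and then A, B, C, handles only single Gregorian dates, and includes just a few of the modifiers.

```python
# Hypothetical sketch of the 12-character layout: C B YYYY M DD * AA
MONTH_CODES = "0123456789ABC"   # 0 = no month, 1..9 then A..C = JAN..DEC (assumed)
MODIFIER_CODES = {"BEF": "1", "": "3", "ABT": "4", "AFT": "B"}  # subset only


def behold_key(year, month=0, day=0, modifier="", bc=False, alt_year=""):
    """Return a sortable key for one Gregorian date (0 = part not known)."""
    era = "1" if bc else "2"
    yyyy = 9999 - year if bc else year   # invert B.C. years so they sort correctly
    return (f"/{era}{yyyy:04d}{MONTH_CODES[month]}{day:02d}"
            f"{MODIFIER_CODES[modifier]}{alt_year:<2}")


dates = [
    behold_key(1900, 12, 25),
    behold_key(1592, 3, 1, alt_year="93"),
    behold_key(44, bc=True),
    behold_key(100, bc=True),
]
# A plain string sort puts 100 B.C. first, then 44 B.C., then the A.D. dates.
for key in sorted(dates):
    print(repr(key))
```

Because every field is fixed width and the most significant fields come first, no custom comparison function is needed.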

If column 1 is a ‘(’, then this is a GEDCOM date phrase stored as text between parentheses, e.g.: (4 days old). Anything that does not fit the standard format automatically becomes a date phrase.

With regard to “incomplete dates”, GEDCOM allows year only, or month and year only. It does not allow month without year or day without month and year. My representation follows this idea and allows ‘/21900A003’ (no day) and ‘/219000003’ (no month and day). Other types of incomplete dates will become a date phrase, e.g. ‘(14 Nov)’.
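
The rule for incomplete dates can be sketched as a small classifier. This is an assumption-laden illustration, not Behold's parser: the name `classify_date` is hypothetical, it only recognises the three-letter Gregorian month abbreviations, it ignores modifiers such as ABT, and it does not check that the day is valid for the month.

```python
import re

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

# Accepts "D MON YYYY", "MON YYYY" or "YYYY"; a day is only allowed with a month.
DATE_RE = re.compile(r"(?:(?:(\d{1,2}) )?([A-Z]{3}) )?(\d{1,4})")


def classify_date(text):
    """Return (year, month, day) with 0 for unknown parts, or a '(phrase)' string."""
    cleaned = text.strip()
    match = DATE_RE.fullmatch(cleaned.upper())
    if match and (match.group(2) is None or match.group(2) in MONTHS):
        day, month, year = match.groups()
        return (int(year),
                MONTHS.index(month) + 1 if month else 0,
                int(day) if day else 0)
    return f"({cleaned})"   # anything else becomes a date phrase


for sample in ["4 Nov 1900", "Nov 1900", "1900", "14 Nov", "4 days old"]:
    print(sample, "->", classify_date(sample))
```

Day-without-year input such as "14 Nov" falls through to the phrase branch, which mirrors the fallback described above.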

I developed this structure before I discovered that RootsMagic uses something similar. RM uses several different formats of dates, but I think it’s worthwhile comparing the RootsMagic text representation of a date as described at http://sqlitetoolsforrootsmagic.wikispaces.com/Date+Formats

Their structure is always a 24-character string, as follows:

RootsMagic Text Dates:

C (Calendar): D = Standard date, Q = Quaker date, T = Text date 
* (Date Modifier): - = NONE, A = After, B = Bef, F = From, I = Since,
                         O = Or, R = Bet/And, S = From/To, T = To,
                         U = Until, Y = By
B (B.C.): ‘–’ = B.C., ‘+’ = A.D.
YYYY (year)  - can be 0000 if for partial date with no year e.g. Jan 1
MM (month) – can be 00 for partial date
DD (day) – can be 00 for partial date
A (Alternate year): ‘/’ = Double Date, ‘.’ = Otherwise
% (Surety): ? = Maybe, 1 = Prhps, 2 = Appar, 3 = Lkly, 4 = Poss,
        5 = Prob, 6 = Cert, A = Abt, C = Ca, E = Est, L = Calc, S = Say, . = other
BYYYYMMDDA% = the second date in a date range (always included)

e.g. 4 Nov 1900 = ‘D.+19001104..+00000000..’
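
To make the field positions concrete, here is a minimal Python sketch that slices a standard ('D') text date into its parts, based only on the layout listed above. The function name `split_rm_date` and the dictionary keys are hypothetical, and the sketch deliberately does not interpret the modifier or surety codes.

```python
def split_rm_date(text):
    """Split a 24-character RootsMagic-style 'D' text date into named fields."""
    if len(text) != 24 or text[0] != "D":
        raise ValueError("expected a 24-character string starting with 'D'")

    def part(chunk):
        # chunk is one 11-character BYYYYMMDDA% group
        return {
            "era": chunk[0],
            "year": int(chunk[1:5]),
            "month": int(chunk[5:7]),
            "day": int(chunk[7:9]),
            "double_date": chunk[9],
            "surety": chunk[10],
        }

    return {
        "calendar": text[0],
        "modifier": text[1],
        "first": part(text[2:13]),
        "second": part(text[13:24]),
    }


fields = split_rm_date("D.+19001104..+00000000..")
print(fields["first"]["year"], fields["first"]["month"], fields["first"]["day"])
```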

It is interesting how similar, yet different, my system is from RootsMagic’s. I’ve never seen a Quaker date. RM doesn’t seem to support the Julian and French and Hebrew dates that are in GEDCOM. Their modifier at the beginning of the string will prevent their dates from sorting properly. They include a surety which isn’t part of the GEDCOM date so I don’t include it. And they are always 24 characters, where my format is usually 12 but 24 if it is a date range.
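
The sorting point can be checked with a few lines of Python. The first string is the RootsMagic example given earlier; the second is constructed here from the layout described (modifier A = After, year only), so treat it as an assumption rather than real RootsMagic output.

```python
rm_dates = {
    "4 Nov 1900": "D.+19001104..+00000000..",
    "After 1832": "DA+18320000..+00000000..",   # hypothetical, built from the layout
}

# A plain string sort compares the modifier character before the year.
naive_order = sorted(rm_dates, key=rm_dates.get)
print(naive_order)

# Skipping the calendar and modifier characters compares the year first instead.
by_year_order = sorted(rm_dates, key=lambda name: rm_dates[name][2:13])
print(by_year_order)
```

In the naive sort, "After 1832" lands after "4 Nov 1900" because the character '.' sorts before 'A'.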

My one takeaway from this is that maybe I could save a character for my double dates by using a single character code instead of the two digits.

Regarding the “Sort Dates” that many programs (RootsMagic included) make you enter for events where you don’t know the date but still want them sorted in a particular order, I think they are superfluous.

My feeling is that if you know enough about the date to be able to order it, then put in what you know using the date modifiers, e.g. if Mary was born in 1832 and John was born after Mary, then I’d say you should put the date for John’s birth in as “AFT 1832” and John will (at least in Behold) be sorted properly after Mary. And make sure you add an appropriate comment and source onto the birth event to state how you know this.
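
Here is a small Python sketch of that idea, using keys written out by hand in the Behold layout described earlier (modifier 1 = BEF, 3 = none, B = AFT). Ann is a made-up third person added only to show the BEF case.

```python
births = [
    ("John", "AFT 1832", "/21832000B  "),
    ("Mary", "1832",     "/218320003  "),
    ("Ann",  "BEF 1832", "/218320001  "),   # hypothetical extra person
]

# Sorting on the key string alone orders the same-year events by modifier.
ordered = sorted(births, key=lambda birth: birth[2])
for name, date_text, key in ordered:
    print(f"{name:5} {date_text:9} {key!r}")
```

No separate sort date is needed here: the modifier character breaks the tie between events that share the same year.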