Louis Kessler’s Behold Blog

Genealogy and Programming: Both Challenging and Fun - Mon, 27 Oct 2014

I’m very lucky, being both a genealogist and a programmer programming genealogy. The two tasks are similar in many respects. You run into problems that are difficult to solve, you need to prove if something is correct or why it isn’t, and some days you make lots of progress whereas other days… well, you can get frustrated, or you can go optimistically forward.

If you’ve been following my Twitter account, you’ll have seen that since I got back from Gaenovium, I’ve committed myself (with the encouragement of a writer friend who is doing the same) to be able to say that I #amprogramming (or for my friend #amwriting) every single day hopefully for 365 days straight. This is an excellent motivational technique, and once you’re on a roll you really get rolling! You see the progress and you look forward to the next day’s work, without having forgotten what you have done before. It’s not full time, but a minimum of an hour or two per day of committed effort. I’m up to day 15 and still going strong.

Coming from a statistics/mathematics/computer science background, some of my most challenging and fun problems in programming are dealing with data structures. It’s not often I talk about really technical programming details in this blog, but this one seems just right, so please bear with me.

Behold reads files written in GEDCOM format. Many years ago, too long ago to remember, I programmed a linked list to represent the family/individual connections in Behold. I created a data structure for this that I call my IndiFam list. Every “family” (father/mother) is connected to the oldest child, who is connected to the next oldest child, until you finally reach the youngest child. I had programmed this oldest to youngest scheme because GEDCOM stated that “the preferred order of the CHILdren pointers within a FAMily structure is chronological by birth.” Most developers followed this suggestion. So little did I expect that I’d have to address this very basic bit of programming years after I first programmed it.
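To make that description concrete, here is a minimal Python sketch of a family that points to its oldest child, with each child pointing to the next youngest. It is an illustration only, not Behold's actual IndiFam code; the names `Family`, `Child`, `next_younger` and `append_child` are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Child:
    """One child node; next_younger points at the next sibling in birth order."""
    indi_id: str
    next_younger: Optional["Child"] = None


@dataclass
class Family:
    """A father/mother pair that points only at its oldest child."""
    fam_id: str
    oldest: Optional[Child] = None

    def append_child(self, indi_id: str) -> Child:
        """Add a child at the young end, as when CHIL lines arrive oldest first."""
        node = Child(indi_id)
        if self.oldest is None:
            self.oldest = node
        else:
            last = self.oldest
            while last.next_younger is not None:
                last = last.next_younger
            last.next_younger = node
        return node

    def children(self) -> Iterator[str]:
        """Walk the chain from the oldest child to the youngest."""
        node = self.oldest
        while node is not None:
            yield node.indi_id
            node = node.next_younger


family = Family("F1")
for indi_id in ("6", "5"):  # CHIL pointers in file order, oldest first
    family.append_child(indi_id)
print(list(family.children()))  # ['6', '5']
```

Appending at the young end is all that is needed as long as the file lists children oldest first, which is why a format that starts from the youngest child does not fit this structure directly.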

Since the recent surprising discovery of GEDCOM 2.0 files, I’ve run across several dozen of these files to play with. I wanted to make sure Behold could read them properly. Some of these are especially challenging for a genealogy program to read. So on day 13 of #amprogramming, I took to this task.

Most of the programming of GEDCOM 2.0 was straightforward. But there was a real challenge in the GEDCOM 2.0 method of linking children.

In GEDCOM 2.0, the family linked to the youngest child:

0 @1@ FAMI
1 CHIL
2 @5@ YOUN

and then each individual linked to their older sibling:

0 @5@ INDI
1 SIBL
2 OLD @6@

The developers of the standard (the Family History Department of the LDS) realized this was a problem for programmers and, for GEDCOM versions 2.1 and later, changed it to simply list the children in order, oldest to youngest, within the FAMily record, like this:

0 @F1@ FAM
1 CHIL @6@
1 CHIL @5@

But I was still left with a challenge if I wanted Behold to read these GEDCOM 2.0 files.
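The heart of the challenge is that the 2.0 links run from youngest to oldest, the reverse of the 2.1 order. A minimal Python sketch, assuming the family's youngest-child pointer and every SIBL.OLD pointer have already been parsed into a plain dictionary; the function name and the dictionary are hypothetical, and this is not how Behold itself does it:

```python
from typing import Dict, List, Optional


def children_oldest_first(youngest: str, older_of: Dict[str, str]) -> List[str]:
    """Follow the older-sibling links from the youngest child, then reverse."""
    order: List[str] = []
    seen = set()
    current: Optional[str] = youngest
    while current is not None:
        if current in seen:
            # Guard against malformed files whose sibling links loop back.
            raise ValueError(f"sibling loop at {current}")
        seen.add(current)
        order.append(current)
        current = older_of.get(current)  # None once the oldest child is reached
    order.reverse()
    return order


# Family @1@ points to youngest child @5@; individual @5@ points to older sibling @6@.
print(children_oldest_first("5", {"5": "6"}))  # ['6', '5']
```

This only works once the whole file has been read. A reader that handles records one at a time, as they arrive, has to reorder its child list incrementally instead.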

Solving this sort of programming puzzle is never obvious. It often takes one or two stabs to first remember what I had put into my original data structures so many years ago, and then a few more attempts to figure out how to implement the change.

After several attempts, I had a version that likely could have worked, but it would have been a mess of complicated code, difficult to maintain or understand. That didn’t sit too well with me, and my mind started mulling over it. While getting ready for bed, I came up with what seemed to be a logical, understandable method and ran to my office and scrawled the following diagram:

[Image: the scrawled diagram]

Okay, so maybe that’s logical and understandable to me, but not to you. Nevertheless, using this I was able to produce some mighty clean code that I had working after less than an hour of coding and testing.

I don’t think I’ve ever shown any of the actual code from Behold in my blog before, but it’s an interesting illustration of what programmer-speak is and what the Object Pascal language used in Delphi looks like:

[Image: the Object Pascal code from Behold]

I won’t subject you to what all that means. All-in-all, Behold calls this routine once for every set of INDI.SIBL.OLD tags in a GEDCOM 2.0 file, and the code will reorder the siblings in the linked list structure of children.
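For readers who would like something runnable, here is a rough Python sketch of a routine in the same spirit: it is called once per older-sibling link and reorders a singly linked list of children. It is not a translation of the Behold routine. The class names, the `older_linked` flag, the assumption that the children are already in the list in some arbitrary order, and the extra individual @7@ are all made up for this illustration.

```python
from typing import Iterable, List, Optional


class Node:
    def __init__(self, indi_id: str) -> None:
        self.indi_id = indi_id
        self.next: Optional["Node"] = None
        self.older_linked = False  # True once this node's SIBL.OLD link is applied


class ChildList:
    """Singly linked list of one family's children (hypothetical structure)."""

    def __init__(self, ids: Iterable[str]) -> None:
        self.head: Optional[Node] = None
        tail: Optional[Node] = None
        for indi_id in ids:
            node = Node(indi_id)
            if tail is None:
                self.head = node
            else:
                tail.next = node
            tail = node

    def ids(self) -> List[str]:
        out, node = [], self.head
        while node is not None:
            out.append(node.indi_id)
            node = node.next
        return out

    def link_older(self, child_id: str, older_id: str) -> None:
        """Apply one SIBL.OLD link: move older_id directly ahead of child_id."""
        # Locate older_id and the run of siblings already attached ahead of it.
        # A run starts at the nearest preceding node with no applied link.
        before = first = prev = None
        node = self.head
        while node is not None:
            if not node.older_linked:
                before, first = prev, node
            if node.indi_id == older_id:
                break
            prev, node = node, node.next
        if node is None:
            raise KeyError(older_id)
        last = node
        child = self.head
        while child is not None and child.indi_id != child_id:
            child = child.next
        if child is None:
            raise KeyError(child_id)
        if child.older_linked or first is child:
            raise ValueError("repeated or circular sibling link")
        if last.next is not None and last.next.older_linked:
            raise ValueError(f"{older_id} already has a younger sibling linked")
        # Unlink the run first..last from where it currently sits.
        if before is None:
            self.head = last.next
        else:
            before.next = last.next
        # Splice the run in directly ahead of the child.
        prev, node = None, self.head
        while node is not child:
            prev, node = node, node.next
        last.next = child
        if prev is None:
            self.head = first
        else:
            prev.next = first
        child.older_linked = True


siblings = ChildList(["5", "6", "7"])  # arbitrary starting order
siblings.link_older("6", "7")          # 0 @6@ INDI / 1 SIBL / 2 OLD @7@
siblings.link_older("5", "6")          # 0 @5@ INDI / 1 SIBL / 2 OLD @6@
print(siblings.ids())                  # ['7', '6', '5']
```

Moving the whole run of already-linked older siblings, rather than a single node, is what lets the links arrive in any order and still end up oldest to youngest.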

Maybe that’s not exciting to you, but for a programmer, getting a “eureka” with a few scrawls on a piece of paper and coding it up in less than an hour is like finding that clue that finally identifies who your great-great-grandmother was.

I’ve still got a few things left in the bucket to get Version 1.1 of Behold out. But with my newfound motivation to ensure I #amprogramming every day, each day will result in a bit of progress with the goal of allowing me to release 1.1 sometime in November.

Now this blog post is out and I’ve got to get back to #amprogramming.

Have you done your #amfamilyresearching for today?

Gaenovium Final Thoughts - Sun, 12 Oct 2014

I had a whirlwind three days in Leiden, Netherlands, to attend the one-day Gaenovium genealogy technology conference.

Gaenovium 2014 was the first time this conference was put on. It was intended to be small with a highly technical audience interested in getting together to discuss aspects of genealogy programming and future genealogy data exchange standards. There were 25 people in attendance and that was really a wonderful size because it gave a chance for every person to talk at some time with every other person.


I gave a presentation called “Reading Wrong GEDCOM Right”, and I thought there would be people who really wouldn’t care that much about this topic. I asked at the beginning of my presentation how many people were familiar with GEDCOM and 90% of the people put up their hands. I then asked how many had actually programmed GEDCOM input and/or output themselves, and half the people kept their hands up. I was quite encouraged by this. After talking for about 20 minutes, I paused and just looked at everyone. They were all silent and looking at me intently. No one seemed uninterested. And so I continued. My presentation was very well received.

My presentation and all the other presentations are now posted at the Gaenovium site.

All the presentations were interesting to me, but meeting the people there was the most enjoyable and worthwhile venture. I’m purposely avoiding naming anyone because I don’t want to leave someone out.

The conference was sponsored primarily by MyHeritage, and secondly by RootsMagic. I was happy to see two people from MyHeritage in attendance. One was the one employee they have in the Netherlands, and the second was the Chief Architect (I love that title, maybe I should call myself the Chief Architect of Behold) who was in from Israel for this. They were both interested in everything said at Gaenovium and were very friendly with the people who were there. I had some worthwhile discussions with both of them about Behold and maybe future connections between Behold and the MyHeritage online family tree. I was a bit disappointed that nobody from RootsMagic had come. It would have been nice to hook up with them as well.

One of the very thoughtful things about the planning of Gaenovium was that they chose the date as the day before a Family History trade show was taking place in the city. This was done on purpose, because some of the attendees also had booths at the trade show and this made it convenient for them to also attend Gaenovium. For me it was the other way around, and it made it convenient for me to visit the trade show. The show was called Famillement, and had over 70 exhibitors with the French company Geneanet being the sponsor. The show was well attended and at times it was impossible to move through the crowd.

[Photos: Timo Kracke and Bob Coret at Famillement]

For me, it was also worthwhile to visit Leiden and get to spend time with Tamura Jones, whom I’ve been communicating with for many years. He was a wonderful host, and the personal tours he gave me of the city were much appreciated.

Some of the participants were already talking of a Gaenovium 2015 a year from now, possibly somewhere in Germany. If you heard about this year’s Gaenovium too late to make arrangements to attend, then stay tuned to see if something gets announced for 2015.

#Gaenovium and #Famillement in Tweets - Sat, 11 Oct 2014

I had a wonderful couple of days in Leiden, Netherlands. I live tweeted the two events I attended and included pictures of some of the people I had the pleasure to meet.

Here are the highlights of Gaenovium on Tuesday through some of my tweets and tweets of others:

The first talk was Bob Coret: Open Genealogy Data in The Netherlands.

The second talk was Marijn Schraagen: Algorithms for Historical Record Linkage.

The third talk was Michel Brinckman: The A2A Data Model and its application in WieWasWie.

The fourth talk was Timo Kracke: GOV: The Genealogical Gazetteer API.

The last talk was mine, Louis Kessler: Reading Wrong GEDCOM Right.

Then came the panel discussion: Current & Future Genealogical Exchange Standards.

 
And the after-conference meal.

 
Some final comments.

 
The next day, Wednesday, there was a family history trade show called Famillement, and many of the Gaenovium participants were there.