Louis Kessler’s Behold Blog

What Do You Want _TODO? - Sun, 9 Nov 2014

With the talk of a new GEDCOM standard, and my talk about the old GEDCOM standard, one item not yet considered has been To Do lists.

Many genealogists seem to want some sort of method of tracking their goals and the information they plan to find. They feel that keeping track of what they want to accomplish will help them do their research. They want their genealogy software to record this for them. They expect that their program will provide the means they need to keep them focused and on track.

Well, that’s what’s supposed to happen in theory, but it doesn’t always seem to work that way in practice. I’ve researched methods of planning for project management and managerial duties, and I’ve looked at various ways to simply keep myself or any person organized and not forget what needs to be done. Ideally there should be one scheme that works right from the big projects down to the simple tasks. Take a look at my past post about Getting Things Done and my next post about Fixing Getting Things Done. I concluded by saying that I thought I had the model to implement a simple but useful To Do list into genealogy software.

That was over 5 years ago. Since then I learned a few things. I found that for me, nothing more complicated than keeping a simple list of things to do worked. Every method other than that worked for a week or a month and then got abandoned for the plain old reliable list. All I really needed to do was make accessing and updating that list simple, and my smart phone turned out to just be perfect for that.

That’s all you need if you only have a few things to do, a simple list. But once your list starts to grow and you have more than 10 or 20 items, it starts to become unwieldy. So you have to divide and conquer and place your items into categories.

What should the categories be? That’s actually obvious when you think about it. You need to subdivide by WHERE you will be doing the task. That way, when you are somewhere, you will have the list of what needs to be done there. People normally segregate this way, and place all the items they’ll buy at their supermarket on a grocery list. All the material needed to build the shed in a corner of the backyard. All the clothes to be washed in a closet. All the items to do on your computer in a huge pile on your desk. And all the items you want to research about your family at a particular website together in one list waiting for when you have some research time for that website.

You do do that last item, don’t you? I’m amazed at how many people don’t. Instead, they organize by person.

Let’s say you want to find your g-g-grandfather’s birth certificate, and his brother’s wife’s name, and their son’s wedding information, and your great-uncle’s immigration record. Let’s write down every other thing we want to find about every person.

Well, that almost seems ridiculous to me. What you’re creating is a big list of all the things that you don’t know and need evidence for. It doesn’t matter how long you’ve been doing genealogy, you’re always going to have people you need information about. Even if you know 10 generations back, you don’t know the 11th. And in between there is always a lot of information you are unsure of.

So let’s just list every unknown fact for 4,000 people in your tree and put those 20,000 items in a To Do list. That should really help, shouldn’t it?

Bleachh!! This is what most genealogy software that allows adding To Do notes by person is telling us to do. It’s totally the wrong way to do this.

What must be done is to organize your To Dos by where you want to do them. You may know of some records that might be at your local archive. There may be some vital statistics you have to write away for at the state office. You may have some information you need to get from Aunt Helen. There may be some online searches you want to do at your computer at a particular website when you have time to do it thoroughly. Or you may want to make a trip to your grandparents’ village and take in everything possible.

It is now clear. Divide and conquer by where the task needs to be done. Attach your To Dos to the WHERE.

In GEDCOM terms, the “where” are the locations that hold the information sources you are looking for. These are known as the repositories, which are recorded as the REPO record in GEDCOM. Every source has a REPO that you got it from. Every REPO is linked to by all the sources you got from it.

I say the proper means is to attach your To Dos to the repository you will do them at. When you go to that repository, be it the local archive, Aunt May, or the online website, you’ll have in front of you the list of what you want to do there. This, in a nutshell, is what you need to do to be organized and efficient.

Okay, now I’ll get off my soapbox and just look at what some programs do. I’m not actually going to look at the programs themselves, but I’m going to look at how they export their To Do data into GEDCOM. This gives a good idea of the thinking behind these programs.

First of all, GEDCOM never included any capability to store your objectives, goals, tasks or To Dos. Many developers added this capability to their genealogy software, and then found no official way to export it into GEDCOM.

Some of them created a _TODO tag. This is a user-defined tag in GEDCOM, marked by a leading underscore on the tag name. The term “user-defined” is a bit of an oxymoron, because it is the developer, not the user, who is defining this tag. Nonetheless, let’s take a look at what they’ve done.

Among my 650 files, I have only 13 with _TODO tags in them. Most of them are from Legacy versions 5.0 and up.

Legacy typically exports its To Dos to GEDCOM in a structure that, in its most complex form, looks like the following. The _TODO tag is subordinate to an INDI record, so each To Do is associated with a specific person. Not all the level 2 tags shown below are included in all cases, so it looks like only the fields the user fills in are exported:

0 @I1@ INDI
1 _TODO
2 DESC Request search for obituary in Paris newspaper
2 _CAT Obituary
2 _LOCL Paris, France
2 DATE Jul 1998
2 _CDATE 6 Aug 1998
2 STAT Completed
2 TYPE 1
2 PRTY 8
2 REPO @R12@
2 NOTE Request search for John’s obituary
2 DATA 6 Aug 1998 - received letter stating that the n
3 CONC ewspapers began publishing after John’s dea
3 CONC th so they were unable to search for an obituary

It’s pretty easy to figure out what each field means here. This really is not that bad an implementation. There’s a date, a completion date, a type that means something program specific, notes, and what I think is most important, a link to the Repository!
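
To make the structure concrete, here is a minimal Python sketch of how a reader might pull these To Dos out of a GEDCOM file. It is not code from Legacy or from Behold. The function name read_todos is hypothetical, the sample is trimmed from the export above, and it assumes a level 1 _TODO line opens the structure and that CONC lines simply extend the level 2 value before them.

```python
import re

# Trimmed sample of the export shown above. The "1 _TODO" line is an
# assumption about how the level 2 tags are attached to the INDI record.
SAMPLE = """\
0 @I1@ INDI
1 _TODO
2 DESC Request search for obituary in Paris newspaper
2 STAT Completed
2 PRTY 8
2 REPO @R12@
2 DATA 6 Aug 1998 - received letter stating that the n
3 CONC ewspapers began publishing after John's dea
3 CONC th so they were unable to search for an obituary
"""

# level, optional @XREF@, tag, optional value
LINE = re.compile(r"^(\d+) (?:(@[^@]+@) )?(\w+)(?: (.*))?$")


def read_todos(text):
    """Return one dict per _TODO structure, keyed by its level 2 tags."""
    todos = []
    owner = None     # xref of the current level 0 record
    todo = None      # the _TODO being filled in, if any
    last_tag = None  # last level 2 tag, so CONC knows what to extend
    for raw in text.splitlines():
        m = LINE.match(raw)
        if not m:
            continue  # skip lines that are not valid GEDCOM lines
        level, xref, tag = int(m.group(1)), m.group(2), m.group(3)
        value = m.group(4) or ""
        if level == 0:
            owner, todo, last_tag = xref, None, None
        elif level == 1:
            todo, last_tag = None, None
            if tag == "_TODO":
                todo = {"owner": owner}
                todos.append(todo)
        elif todo is not None and level == 2:
            todo[tag] = value
            last_tag = tag
        elif todo is not None and level == 3 and tag == "CONC" and last_tag:
            todo[last_tag] += value  # CONC joins with no added space
    return todos


todos = read_todos(SAMPLE)
print(todos[0]["owner"], todos[0]["REPO"])  # prints: @I1@ @R12@
```

Once the CONC lines are joined, the DATA value reads as one sentence again, which is why the split in the middle of “newspapers” and “death” does no harm.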

What is really sad is that the 13 GEDCOM files I have that use the _TODO tag use it only two or three times in the entire file. If this were something that really was set up in a useful way, you’d think people would use the feature much more. But they don’t seem to.

Displaying this information with the Repository would be useful. But Legacy likely only displays it by person, which IMO is pretty useless. This might explain why so few people seem to use the feature.

I have also seen similarly non-trivial implementations of the _TODO tag in GEDCOMs produced by Ancestral Quest, RootsMagic and Family Origins (the predecessor to RootsMagic). It’s amazing how similar the four implementations look – and that’s not a bad thing, because it makes it possible to transfer the To Do data correctly between these programs.

I think it is useful for a future GEDCOM replacement to include this sort of information for the researcher. But one change must be made. The ToDo tag should be attached to the repository record and within it link to the person or people (if any) the To Do may be about. It should be of the form:

0 @R12@ REPO
1 _TODO
2 DESC Request search for obituary in Paris newspaper
2 _CAT Obituary
2 _LOCL Paris, France
2 DATE Jul 1998
2 _CDATE 6 Aug 1998
2 STAT Completed
2 TYPE 1
2 PRTY 8
2 INDI @I1@
2 NOTE Request search for John’s obituary
2 DATA 6 Aug 1998 - received letter stating that the n
3 CONC ewspapers began publishing after John’s dea
3 CONC th so they were unable to search for an obituary

Doing so would encourage developers to attach the To Do items to the Repository and display them with the Repository information. Then the information would be useful. If this information becomes useful, then people might actually start using it.
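
As a rough illustration of what “display them with the Repository” could mean, here is a small Python sketch that regroups To Do items by repository instead of by person. It is not taken from any of the programs mentioned. The item fields (person, repo, desc), the sample entries and both function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical To Do items as a program might hold them after import.
todos = [
    {"person": "@I1@", "repo": "@R12@",
     "desc": "Request search for obituary in Paris newspaper"},
    {"person": "@I7@", "repo": "@R12@", "desc": "Look for marriage notice"},
    {"person": "@I1@", "repo": "@R3@", "desc": "Ask about family photos"},
    {"person": "@I9@", "repo": None, "desc": "Find immigration record"},
]


def group_by_repository(items):
    """Map each repository id to the To Dos to be done there."""
    by_repo = defaultdict(list)
    for item in items:
        # Items with no repository yet go into their own bucket.
        key = item["repo"] or "(no repository yet)"
        by_repo[key].append(item)
    return dict(by_repo)


def format_worklist(by_repo, repo_id):
    """Build the list you would take along to one repository."""
    lines = [f"To Dos for {repo_id}:"]
    for item in by_repo.get(repo_id, []):
        lines.append(f"  [{item['person']}] {item['desc']}")
    return "\n".join(lines)


by_repo = group_by_repository(todos)
print(format_worklist(by_repo, "@R12@"))
```

The person link is still there on every item, but it has become a detail of the task rather than the way the list is organized.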

Announcing GEDCOM File Finder - Wed, 5 Nov 2014

I am releasing a freeware program I am calling GEDCOM File Finder. You can find it at: www.beholdgenealogy.com/gedcomfilefinder

It’s a nice little program that does just one thing: It finds and classifies all the GEDCOM files (or GEDCOM variants) on your computer. It is simple and only has one screen, allowing you to set the starting directory, filter the filename and include only files that contain some desired text. Then it displays important information about the files it finds, and allows you to load any file with your default program that opens .ged files (usually your genealogy software), or lets you use your default text editor to view the GEDCOM file directly.
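
For the programmers reading, the basic idea can be sketched in a few lines of Python. This is not how GEDCOM File Finder itself is written (it grew out of Behold, which is written in Delphi), and it does far less classification. The function names are hypothetical, and the sketch assumes the file can be read as text and that the producing program is named on the header’s 1 SOUR line.

```python
import fnmatch
import os


def header_source(text):
    """Return the value of the header's '1 SOUR' line, or None."""
    for line in text.splitlines():
        parts = line.strip().split(" ", 2)
        if parts[0] == "0" and parts[1:2] != ["HEAD"]:
            break  # a new level 0 record means the header has ended
        if parts[:2] == ["1", "SOUR"]:
            return parts[2] if len(parts) > 2 else ""
    return None


def find_gedcoms(start_dir, pattern="*.ged", must_contain=None):
    """Yield (path, source_program) for matching files under start_dir."""
    for folder, _subdirs, names in os.walk(start_dir):
        for name in names:
            # Compare in lower case so .GED and .ged both match.
            if not fnmatch.fnmatch(name.lower(), pattern.lower()):
                continue
            path = os.path.join(folder, name)
            try:
                # Real files come in several encodings; this sketch just
                # replaces bytes it cannot decode.
                with open(path, encoding="utf-8-sig",
                          errors="replace") as handle:
                    text = handle.read()
            except OSError:
                continue  # unreadable file: skip it
            if must_contain and must_contain not in text:
                continue
            yield path, header_source(text)


if __name__ == "__main__":
    for path, source in find_gedcoms(".", must_contain="_TODO"):
        print(path, source)
```

Run from a folder of GEDCOMs, this lists every file that contains a _TODO tag along with the program that claims to have written it.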

You’ll find it useful if you have more than a few GEDCOMs on your computer.
… or if you forgot where on your computer you put some of them.

You might be surprised and find a few files you didn’t even know you had.

The interface looks like this:


GEDCOM File Finder started off as the Find Files function in Behold. I developed it so that I could easily find some test files to test whatever part of Behold I was working on at the time. With over 650 GEDCOM files on my computer, it wasn’t an easy task to find the best ones for testing. The Find Files function helped.

Important to me was to know specifics about the different versions of GEDCOM and what made them tick. So I included this information on the right side of the Results Area that you can’t see in the screenshot above. But it looks like this:

The information displayed about the specifics of each program includes new insights that were just recently discovered.

During the build up to the very enjoyable time I had at Gaenovium, I was putting my presentation Reading Wrong GEDCOM Right together. Tamura Jones kindly helped me review and investigate and refine my presentation – I think we went through over 8 drafts prior to presenting it. In so doing Tamura uncovered a lot of previously unknown information about detecting GEDCOM versions. He put this information together in a series of articles on his website, and at the bottom of many of them, he included some best practices. His new articles useful for interpreting GEDCOM files include:

This is some amazing work done by Tamura in a short period of time. So I set to work and implemented many of the best practices that Tamura recommends in his articles, and that is reflected in the information GEDCOM File Finder displays in its Results Area.

Now with GEDCOM File Finder, you’ll have a program that should identify, as well as possible, the particular species of GEDCOM that your files belong to. And be aware that your genealogy software may be able to read anywhere from most of the data (if you’re lucky) to none of the data (if you’re not), depending on what GEDCOM species it is trying to load.

If you have Behold, you’ll find GEDCOM File Finder even more useful. It will also be able to list, find and view Behold Organize Files and Behold Log Files. You don’t have to do anything to implement this. GEDCOM File Finder will detect that you have Behold installed, and will add those features for you.

Enough of this. Why don’t you download GEDCOM File Finder and try it out for yourself? Maybe you’ll find it useful. Freeware. Enjoy!

Download GEDCOM File Finder

Genealogy and Programming: Both Challenging and Fun - Mon, 27 Oct 2014

I’m very lucky, being both a genealogist and a programmer programming genealogy. The two tasks are similar in many respects. You run into problems that are difficult to solve, you need to prove if something is correct or why it isn’t, and some days you make lots of progress whereas other days… well, you can get frustrated, or you can go optimistically forward.

If you’ve been following my Twitter account, you’ll have seen that since I got back from Gaenovium, I’ve committed myself (with the encouragement of a writer friend who is doing the same) to be able to say that I #amprogramming (or for my friend #amwriting) every single day hopefully for 365 days straight. This is an excellent motivational technique, and once you’re on a roll you really get rolling! You see the progress and you look forward to the next day’s work, without having forgotten what you have done before. It’s not full time, but a minimum of an hour or two per day of committed effort. I’m up to day 15 and still going strong.

Coming from a statistics/mathematics/computer science background, some of my most challenging and fun problems in programming are dealing with data structures. It’s not often I talk about really technical programming details in this blog, but this one seems just right, so please bear with me.

Behold reads files written in GEDCOM format. Many years ago, too long ago to remember, I programmed a linked list to represent the family/individual connections in Behold. I created a data structure for this that I call my IndiFam list. Every “family” (father/mother) is connected to the oldest child, who is connected to the next oldest child, until you finally reach the youngest child. I had programmed this oldest to youngest scheme because GEDCOM stated that “the preferred order of the CHILdren pointers within a FAMily structure is chronological by birth.” Most developers followed this suggestion. So little did I expect that I’d have to address this very basic bit of programming years after I first programmed it.

Since the recent surprising discovery of GEDCOM 2.0 files, I’ve run across several dozen of these files to play with. I wanted to make sure Behold could read them properly. Some of these are especially challenging for a genealogy program to read. So on day 13 of #amprogramming, I took to this task.

Most of the programming of GEDCOM 2.0 was straightforward. But there was a real challenge in the GEDCOM 2.0 method of linking children.

In GEDCOM 2.0, the family linked to the youngest child:

0 @1@ FAMI
2 @5@ YOUN

and then each individual linked to their older sibling:

0 @5@ INDI
1 SIBL
2 OLD @6@

The developers of the standard (the Family History Department of the LDS) realized this was a problem for programmers and for GEDCOM versions 2.1 and later, changed it to simply list the children in order oldest to youngest within the FAMily record, like this:

0 @F1@ FAM
1 CHIL @6@
1 CHIL @5@

But I was still left with a challenge if I wanted Behold to read these GEDCOM 2.0 files.

Solving this sort of programming puzzle is never obvious. It often takes one or two stabs to first remember what I had put into my original data structures so many years ago, and then a few more attempts to figure out how to implement the change.

After several code writes, I had one that likely could have worked, but it would have been a mess of complicated code difficult to maintain or understand. That didn’t sit too well with me, and my mind started mulling over it. While getting ready for bed, I came up with what seemed to be a logical understandable method and ran to my office and scrawled the following diagram:


Okay, so maybe that’s logical and understandable to me, but not to you. Nevertheless, using this I was able to produce some mighty clean code that I had working after less than an hour of coding and testing.

I don’t think I’ve ever shown any of the actual code from Behold in my blog before, but it’s an interesting illustration of what programmer-speak is and what the Object Pascal language used in Delphi looks like:


I won’t subject you to what all that means. All-in-all, Behold calls this routine once for every set of INDI.SIBL.OLD tags in a GEDCOM 2.0 file, and the code will reorder the siblings in the linked list structure of children.
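
For anyone who does want to see the idea in code, here is a simplified sketch in Python. It is not the Delphi routine from Behold, which updates its linked list one INDI.SIBL.OLD tag at a time. It assumes instead that the youngest-child pointer and all the older-sibling links have already been collected, and the function name and the dictionary of links are hypothetical.

```python
def children_oldest_first(youngest, older_of):
    """Follow youngest -> older links, then reverse to oldest-first order.

    youngest: id of the child the family record points to
    older_of: dict mapping each child id to its next older sibling's id
    """
    chain = []
    seen = set()
    current = youngest
    while current is not None and current not in seen:
        seen.add(current)  # guard against a cyclic chain in a bad file
        chain.append(current)
        current = older_of.get(current)  # None once the oldest is reached
    chain.reverse()
    return chain


# The two-child family from the example above: @5@ is youngest, @6@ is older.
print(children_oldest_first("@5@", {"@5@": "@6@"}))  # prints: ['@6@', '@5@']
```

The result is the same oldest to youngest order that GEDCOM 2.1 and later write directly as CHIL lines.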

Maybe that’s not exciting to you, but for a programmer, getting a “eureka” with a few scrawls on a piece of paper and coding it up in less than an hour is like finding that clue that finally identifies who your great-great-grandmother was.

I’ve still got a few things left in the bucket to get Version 1.1 of Behold out. But with my newfound motivation to ensure I #amprogramming every day, each day will result in a bit of progress with the goal of allowing me to release 1.1 sometime in November.

Now this blog post is out and I’ve got to get back to #amprogramming.

Have you done your #amfamilyresearching for today?