Louis Kessler’s Behold Blog

Multiple Events and Unions in GEDCOM - Sun, 24 Mar 2013

I have to make a decision on how to implement some aspects of the Life Events that are going into Behold. I’ve run into a couple of places where GEDCOM isn’t clear. I thought I’d document them here for the benefit of other developers and for consideration in any future standard.

The lines in GEDCOM 5.5.1 under consideration are these:

The order in which GEDCOM lines are written to a GEDCOM file is controlled by the context and level number. When the lines are of equal level number but have a different tag name then the order is not significant. The occurrence of equal level numbers and equal tags within the same context imply that multiple opinions or multiple values of the data exist. The significance of the order in these cases is interpreted as the submitter’s preference. The most preferred value being the first with the least preferred data listed in subsequent lines by order of decreasing preference. For example, a researcher who discovers conflicting evidence about a person’s birth event would list the most credible information first and the least credible or least preferred items last.

Systems that support multiple fields or structures should allow their users to indicate their information (the first occurrence listed) and store the remaining information as an exception, preferably within an appropriate NOTE field or in some way that the patron has ready access to the less-preferred data when viewing the record.

Conflicting event dates and places should be represented by placing them in separate event structures with appropriate source citations rather than by placing them under the same enclosing event.

What this means is that if you have two conflicting sets of information for an event, such as a birth event, then there should be separate event structures for them, e.g.:

1 BIRT
2 DATE 1880
1 BIRT
2 DATE 1870

Presumably you’d have more information with each including the full dates, the places, your sources and notes about each bit of evidence. Because of the GEDCOM rule, the first of the two would be considered the preferred, i.e. most credible date.
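
As a rough sketch of how a reader could apply that rule, the Python below splits the four-line example into (level, tag, value) tuples and takes the first BIRT date as the preferred one. The function names `parse_lines` and `event_dates` are made up for illustration; they are not from Behold or any GEDCOM library, and a real reader would need to handle record ids, CONC/CONT lines and much more.

```python
def parse_lines(text):
    """Split GEDCOM text into (level, tag, value) tuples, skipping blank lines."""
    rows = []
    for raw in text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2:
            continue
        level, tag = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""
        rows.append((level, tag, value))
    return rows


def event_dates(rows, event_tag):
    """Return the first DATE of each level-1 event_tag structure, in file order."""
    dates = []
    in_event = False
    for level, tag, value in rows:
        if level <= 1:
            # A level-0 or level-1 line ends the previous event structure.
            in_event = (level == 1 and tag == event_tag)
            if in_event:
                dates.append(None)  # placeholder until a DATE line is seen
        elif level == 2 and in_event and tag == "DATE" and dates[-1] is None:
            dates[-1] = value
    return dates


SAMPLE = """\
1 BIRT
2 DATE 1880
1 BIRT
2 DATE 1870
"""

births = event_dates(parse_lines(SAMPLE), "BIRT")
# File order is the submitter's preference: first is preferred, the rest are alternates.
preferred, alternates = births[0], births[1:]
print(preferred, alternates)  # 1880 ['1870']
```

Note that the preferred date is simply the first one listed, not the earliest.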

This is all fine and good for events like Birth and Death that, other than in extremely unusual circumstances (e.g. brought back from a coma, or science fiction), normally occur only once in any person’s life.

The trouble is that almost any other event can occur multiple times in a person’s life: adoption, naturalization, census, education, retirement. There have been people who have had multiple baptisms and even multiple burials.

This results in a problem. For events other than Birth and Death, if the events are represented like the 4-line GEDCOM example above, how do you tell if they are two different events of the same type, or if they are two sets of conflicting information about the same event?

The answer is, you can’t. GEDCOM does not explain how to distinguish the difference.

Therefore, in Behold, I’ll be going out on a limb and assuming multiple events listed are different events, except in the case of births and deaths where they’ll represent conflicting information for the same event.
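
A minimal sketch of that assumption, in Python, might look like the following. This is not Behold's actual code: the function name `interpret_events`, the dictionary layout and the census dates are all hypothetical, and the set of once-only tags reflects the choice described above rather than anything GEDCOM specifies.

```python
# Tags treated as once-in-a-lifetime. This set is an assumption made by the
# reader program, not a rule stated in the GEDCOM standard.
SINGLE_OCCURRENCE = {"BIRT", "DEAT"}


def interpret_events(events):
    """events: list of (tag, date) pairs in file order.

    Returns a list of dicts, each with 'tag', 'date' and 'alternates'.
    """
    result = []
    seen = {}  # tag -> the single event already created for that tag
    for tag, date in events:
        if tag in SINGLE_OCCURRENCE and tag in seen:
            # A repeated BIRT/DEAT is conflicting information about the same event.
            seen[tag]["alternates"].append(date)
            continue
        # Anything else, including a repeated CENS, is a separate event.
        event = {"tag": tag, "date": date, "alternates": []}
        result.append(event)
        if tag in SINGLE_OCCURRENCE:
            seen[tag] = event
    return result


# Hypothetical data: two conflicting births and two separate censuses.
guy = [("BIRT", "1880"), ("BIRT", "1870"), ("CENS", "1881"), ("CENS", "1891")]
for event in interpret_events(guy):
    print(event)
```

With this input the two BIRT entries collapse into one birth with an alternate date, while the two CENS entries stay as two events.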

You can see some of my earlier discussion about this when I was first considering the ordering of events.

That got me to wonder about marriages. The marriage event is considered to be the other “important” event in pure genealogy (not family history) research. Surely, people will want to show conflicting sets of marriage information.

Using the reasoning above, a person may get married multiple times. So a marriage is not a unique event in a person’s life. And the above reasoning says you cannot distinguish whether two MARR tags refer to one marriage or to different marriages.

But marriages are different because they are not associated with an individual (INDI) record. Instead they are associated with two people via a FAM record.

The FAM record was unfortunately misnamed as people always refer to it as a family of a husband and wife and their children. I’ll leave the definition of a “family” for some other discussion. Instead I’ll refer to what GEDCOM says about the FAM record:

The FAMily record is used to record marriages, common law marriages, and family unions caused by two people becoming the parents of a child. There can be no more than one HUSB/father and one WIFE/mother listed in each FAM_RECORD. If, for example, a man participated in more than one family union, then he would appear in more than one FAM_RECORD. The family record structure assumes that the HUSB/father is male and WIFE/mother is female.

So a FAM record is really a union of two people becoming the parents of a child. Whether or not they get married is irrelevant. Whether or not there is no more than one male and one female parent is technically irrelevant (there can be same-sex couples adopting a child) but GEDCOM requires it (Boooo).

By this definition, there can be one FAM record for every child. It’s a mishmash because events about the two parents (engagement, residence, census, divorce) are included with the FAM record. If there were a FAM record for every child, the two parents would be repeated multiple times, and which one would you assign their combined events to? So the general implementation adopted by most developers is that one union with all their children is put together in a single FAM record.

Now my question: What happens if two people break up and then get together again? There may be a second marriage event for the same two people. Does that go into one FAM record, or do you create two FAM records with it?

This is a key question. If you create one FAM record, then there can be more than one marriage event in the record and you will no longer be able to tell conflicting marriage information apart from two separate marriage events. But if they go in two FAM records, then there will be only one marriage event per FAM record.
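
To make the two layouts concrete, here is a small Python sketch that holds both as GEDCOM text and counts the MARR events in each FAM record. The record ids and dates are invented for illustration, and `marr_counts` is a hypothetical helper, not part of any existing tool.

```python
# Layout A: one FAM record holding both marriages (ids and dates are made up).
ONE_FAM = """\
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 MARR
2 DATE 1900
1 DIV
2 DATE 1905
1 MARR
2 DATE 1910
"""

# Layout B: a second FAM record for the remarriage of the same couple.
TWO_FAMS = """\
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 MARR
2 DATE 1900
1 DIV
2 DATE 1905
0 @F2@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 MARR
2 DATE 1910
"""


def marr_counts(text):
    """Count level-1 MARR lines in each FAM record, keyed by record id."""
    counts = {}
    current = None
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        if parts[0] == "0":
            # A level-0 line starts a new record, e.g. "0 @F1@ FAM".
            is_fam = len(parts) > 2 and parts[2] == "FAM"
            current = parts[1] if is_fam else None
            if current is not None:
                counts[current] = 0
        elif parts[0] == "1" and parts[1] == "MARR" and current is not None:
            counts[current] += 1
    return counts


print(marr_counts(ONE_FAM))   # {'@F1@': 2}
print(marr_counts(TWO_FAMS))  # {'@F1@': 1, '@F2@': 1}
```

In layout A a reader sees two MARR events in one FAM and cannot tell from the structure alone whether 1910 is a remarriage or a conflicting date for the 1900 marriage. In layout B each FAM has at most one marriage, so any additional MARR inside a FAM could safely be read as conflicting information.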

What about the children from the remarriage? The children will be full brothers and sisters with those from the first marriage since they have the same parents. That reasoning would suggest just having one FAM record.

But there may be an intervening marriage with someone else in between, which could also result in children. Should there then be 3 FAM records or only 2?

The answer, again, is that GEDCOM doesn’t tell you how to do this. So the developers all had to decide for themselves.

Tamura Jones suggested to me that I test this out myself with some of the more popular genealogy programs and see what they do. Do they accept the data either way? Do they then output the data the same way? Maybe I can generalize a “best practice” from this.

So I built a tricky little 84-line GEDCOM file that I’ll try on several programs. It’s got Guy Main who marries Gal One and has MarriageOne Child. They divorce and he marries Gal Two and has MarriageTwo Child. The end of their marriage isn’t described but he then remarries his first wife Gal One and they have MarriageThree Child. Just for fun, each of the three families has two marriage dates, a preferred date and an alternate date, and the one divorce also has a preferred date and an alternate date. Onto Guy Main, I’ve also added a preferred and alternate birth date and two census dates that shouldn’t be treated as preferred and alternate, but as two separate events.

Hmm. In setting this up, I ran it through Behold. Behold finds no problems with the GEDCOM file. However, I noticed some things the most recent version of Behold does wrong in displaying this file that have already been fixed in my development version.

For a second check, I ran the GEDCOM through Tim Forsythe’s VGedX program. It complains only about too few ADDR tags, but I think that is wrong on VGedX’s part. And I ran it through Tim Forsythe’s Bonkers program, which says there are no problems.

Now let’s see how some other programs handle it:

1. PAF 5.1.7.0

No errors reported on import.

- Treats the first birth date and first marriage date as the preferred one.
- Lists 3 spouses. (That’s when you click an “Other Marriages” button that is misnamed and should say “All Marriages”).
- Guy Main lists 3 other events: The alternate birth and the two census events.
- Notes are shown together for Guy Main, rather than on each event. Bad!
- Gal One marriage has alternate marriage and 2 divorces but dates are missing.
- Notes are shown together for marriage, rather than on each event. Bad!
- Exported marriages correctly.
- Exported only the preferred marriage date. It did not export the divorce date; the export somehow got messed up by the notes added to the marriage and divorce events. Bad!

Attempting to add another marriage of Guy Main to Gal One gives a message box saying: “Gal One-2 is already in a marriage with Guy Main-1. Do you want to create another marriage for them anyway?” This indicates that PAF prefers separate FAM records for remarriages of the same couple.

Conclusion: PAF understands preferred birth dates and preferred marriage dates but does not designate them as such. A remarriage is better as a separate FAM because PAF may mess up multiple marriages within one FAM.

 

2. RootsMagic 4.0.9.7

Nothing mentioned in its .lst file after import.

- Treats the first birth date and first marriage date as the preferred one.
- Lists 3 spouses.
- Guy Main shows all events in date order. This is confusing because there are 2 births, 2 censuses and 6 marriages listed. You can’t tell which is primary and which is alternate here.
- Exports events in date order. It loses the preferential birth and marriage. Bad!
- Exports lots of extra junk in its default GEDCOM. 651 lines long. Bad!

Attempting to add another marriage of Guy Main to Gal One gives a message box saying: “These two people are already linked as a couple. If they were married twice to each other you can add a second marriage fact.” and it won’t let you add it! This indicates that RootsMagic must think the alternate marriage date is in fact a separate marriage event.

Conclusion: RootsMagic indicates it prefers one couple married twice to be two events under the same couple. RootsMagic does not handle alternate marriage dates but thinks of those as separate marriage events. It really doesn’t handle preferred and alternate birth dates either.

 

3. Legacy 7.5

The import box displays: “Individuals:6” without a space between the colon and the 6. That is just sloppy. But no messages are displayed.

- Legacy understands preferred and alternate birth events and lists the latter as an “Alt. birth” event. Excellent!
- Census events are listed as two separate events.
- Legacy understands preferred and alternate marriage and divorce events and lists them as such. Excellent!
- Unfortunately, Legacy has the first marriage set to preferred, obviously to indicate whom to include in reports. Is there such a thing as a “preferred” marriage? The wording “preferred” is inappropriate here.

Attempting to add another marriage of Guy Main to Gal One gives a message box saying: “Gal One [2] is already his wife. Do you want to link her again?” but it allows you to and creates a new FAM record.

Conclusion: Legacy understands preferred and alternate events the way GEDCOM may have intended. It thus prefers additional marriages to be in separate FAM records, so that multiple marriage and divorce dates within one FAM record can be read as preferred and alternate information.

 

4. Family Tree Maker 2008 (sorry, I don’t have a newer version).

I had the GEDCOM file marked as GEDCOM 5.5.1, so FTM gave the message: “GEDCOM file must be version 5.5 or greater” and would not read it. So I changed the 5.5.1 to 5.5 and reimported the file. No errors in the log file.

- FTM, as did the other programs, correctly treated the first event as primary.
- FTM shows the birth, marriage and divorce events and displays which are preferred. This, like Legacy, is the way GEDCOM may have intended it.
- FTM merges the info about Guy Main’s marriages to Gal One together, indicating its understanding that this is the same person who Guy Main remarried. This is excellent as none of the other three programs denoted this.
- However, the two Census events were listed with the first one as preferred. This is bad since it doesn’t indicate an understanding that there may be more than one event of this type.
- Although FTM tried to indicate its “smarts” about knowing that Guy Main married Gal One twice, on the Relationships page it indicates that Gal One is now “Spouse – Divorced” and Gal Two is “Spouse – Ongoing”, which is wrong, since Gal One has been remarried since the divorce. If FTM noted the order of the FAMS tags, it would have got this right.
- Exported the alternate birth first and the primary birth second. Bad.
- Exported only one of the two marriage and divorce dates. Bad.
- FTM 2008 crashed once while I was picking another tab in it.

Attempting to add another marriage of Guy Main to Gal One gives a message box saying: “These people cannot be attached because one of them is already in a direct relationship of the other.” And it won’t let you do it. This is despite its seeming understanding that Guy Main married Gal One twice. Since FTM recognizes primary marriages and forces you to put a remarriage under the same person, the other marriages will have to be non-primary and you will no longer be able to tell if they are alternatives of the same marriage, or a real new marriage. Too bad.

Conclusion: At least FTM indicates the primary event. But it does everything else wrong.

 

Overall Conclusion: This is one horrible exercise to do. Since GEDCOM isn’t clear about what to do, everyone has implemented it differently.

To me Legacy does it best. But to settle on one way that’s incompatible with everyone else is not really a good idea. The real solution may have to wait until a new standard to replace GEDCOM is created, one that will have a more rigorous way of defining preferred and alternate information for an event, and how to properly handle the various types of unions.

Below is a set of best practices as I see them. The format of this is inspired by Tamura who originated the idea.

Best Practice

GEDCOM reader

  • Multiple BIRT or DEAT events in an INDI record should be treated as a single birth or death with conflicting sets of information.
  • The order of the BIRT and DEAT events should be treated as the user’s judgemental ranking from most likely correct to least likely correct (or even incorrect).
  • All other event/fact types that occur multiple times within an INDI record should be treated as separate events/facts.
  • Multiple MARR or DIV events in a FAM record should be treated as a single marriage or divorce with conflicting sets of information.
  • The order of the MARR and DIV events should be treated as the user’s judgemental ranking from most likely correct to least likely correct (or even incorrect).
  • All other event/fact types that occur multiple times within a FAM record should be treated as separate events/facts.
  • Two or more FAM records with the same HUSB and WIFE should be treated as multiple unions of the same two people. No warning should be issued. Children of these FAMs are full siblings and should be displayed as such.
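
The last reader rule can be sketched in Python as a grouping of FAM records by their HUSB/WIFE pair. The data below mirrors the shape of the test file (one man, two wives, three unions, one child each), but the ids are invented and `unions_by_couple` is a hypothetical helper operating on already-parsed records.

```python
from collections import defaultdict


def unions_by_couple(fams):
    """fams: list of dicts with 'id', 'husb', 'wife' and 'children' keys.

    Returns {(husb, wife): {'fams': [...], 'children': [...]}}, keeping file order.
    """
    couples = defaultdict(lambda: {"fams": [], "children": []})
    for fam in fams:
        key = (fam["husb"], fam["wife"])
        couples[key]["fams"].append(fam["id"])
        # Children of every union of the same couple share both parents.
        couples[key]["children"].extend(fam["children"])
    return dict(couples)


# Hypothetical ids: @I1@ marries @I2@, then @I3@, then @I2@ again.
fams = [
    {"id": "@F1@", "husb": "@I1@", "wife": "@I2@", "children": ["@I4@"]},
    {"id": "@F2@", "husb": "@I1@", "wife": "@I3@", "children": ["@I5@"]},
    {"id": "@F3@", "husb": "@I1@", "wife": "@I2@", "children": ["@I6@"]},
]
couples = unions_by_couple(fams)
full_siblings = couples[("@I1@", "@I2@")]["children"]
print(full_siblings)  # ['@I4@', '@I6@']
```

A couple with more than one entry in `fams` is simply a couple with multiple unions, so no warning is needed, and the children collected under that couple can be displayed as full siblings.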

GEDCOM validator

  • Do not give any messages for allowed multiple events/facts of any type.
  • Do not give any messages for multiple FAM records with the same HUSB and WIFE.

GEDCOM writer

  • Export multiple BIRT or DEAT events in an INDI record when there is conflicting information, ordered from most likely correct to least likely.
  • Export multiple MARR or DIV events in a FAM record when there is conflicting information, ordered from most likely correct to least likely.
  • For all other events/facts, export multiple events of one type only when they are separate events/facts.
  • Create a separate FAM record for each remarriage/reunion of a couple after they have divorced or otherwise separated and then got together again. Attach the children to the appropriate marriage/union.
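
For the writer side, a minimal Python sketch of the ordering rule could look like this. The function `write_event` and its arguments are assumptions for illustration only; a real writer would also emit places, sources and notes under each event.

```python
def write_event(tag, preferred, alternates=()):
    """Return GEDCOM lines for one logical event, preferred date first.

    Each conflicting date gets its own event structure, as GEDCOM 5.5.1 asks.
    """
    lines = []
    for date in (preferred, *alternates):
        lines.append(f"1 {tag}")
        lines.append(f"2 DATE {date}")
    return lines


print("\n".join(write_event("BIRT", "1880", ["1870"])))
```

This reproduces the four-line BIRT example with 1880 first. Sorting the dates chronologically before writing would put 1870 first and silently change which date a reader treats as preferred.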

Future GEDCOM Standard

  • Eliminate the ambiguity caused by multiple events being allowed to represent conflicting information in some cases, but different events in other cases. Make conflicting information a substructure of the preferred event. Then multiple events will always represent different events.
  • The current FAM record is defined as a marriage/union of a man and woman becoming parents. This definition is wrong in many ways and the implication that it represents a family is misleading. Change the structure to be a UNI (union) of two (or more) people for any purpose. Allow same-sex couples. Don’t require a child but do allow them.
  • Change the HUSB/WIFE tags to be a non-sex INDI tag. Whether INDI is a husband or wife will be able to be inferred by the sex of the person being pointed to and if the couple has a MARR event.

 

3 Comments

1. trolleydave (trolleydave)
Posted: Mon, 25 Mar 2013

Having nothing better to do, I ran your file thru Roots Magic V. 6. Also edited version to reverse a couple of pairs of dates. One improvement - the re-export was only 255 lines long. Everything else behaved about as you described it. Where there was only one date allowed, the first occurring in the GEDCOM was used, but in lists of events, etc, all versions were listed in chrono order w/o differentiation. Only place you could tell which was which was on the edit person page where your included notes were displayed in a side box along with sources, etc. But it did keep the people and order of marriages and children intact (don’t know if that was true of the re-export, got tired of fooling with it).

FWIW

(push) (push) didn’t I hear a hint about a beta?

Still at the mouse hole …

Dave

2. trolleydave (trolleydave)
Posted: Mon, 25 Mar 2013

PS I did go back and take a closer look at the exported GEDCOM, and yes it does output the multiple b. dates in chrono order. Dave

3. Louis Kessler (lkessler)
Posted: Tue, 26 Mar 2013

Dave:

Thanks for checking the latest version of RM for me. But I really wouldn’t expect that programs would make changes to handle a GEDCOM ambiguity, so however they handle it, that reaction would probably stay embedded into the program until there was an important reason to address it.

Here’s a hint: When I start making blog entries about genealogy programming, it’s usually because I’m back working hard on Behold.

Louis

 

The Following 1 Site Has Linked Here

  1. Genea-Musings : Sun, 31 Mar 2013
    Best of the Genea-Blogs - 24 to 30 March 2013 ... Louis demonstrates several problems with the current GEDCOM standard. ...
