Louis Kessler's Behold Blog

CONC Me On The Head - Sun, 10 Jan 2010

Ah, the CONC tag. It stands for Concatenate the line. It is put in a GEDCOM file to split a long line. For example:

2 NOTE This is a note that is sp
3 CONC lit on two lines.

The note will be put together and the word “split” will be re-formed as one word. That is the way CONC is clearly defined in GEDCOM: the break always falls in the middle of a word, with the first part of the word on the previous line and the rest of it completed on the CONC line.

Many programs follow that standard, but a lot of programs missed that little nuance in GEDCOM and implemented their GEDCOM output in what would seem the more straightforward way:

2 NOTE This is a note that is split
3 CONC on two lines.

Here the split occurs after the word. What this means for a program like Behold is that it now has to add a space before concatenating the two lines. In the previous example no space was added.
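The difference between the two conventions comes down to a single parameter at the join. Here is a minimal Python sketch of that idea, not Behold's actual code; the function name `join_conc` and its `spaces` parameter are made up for illustration.

```python
def join_conc(lines, spaces=0):
    """Join a GEDCOM value with its CONC continuation lines.

    `lines` holds raw GEDCOM lines such as "2 NOTE text" and "3 CONC more".
    `spaces` is the number of spaces inserted at each join (0 or 1).
    """
    parts = []
    for line in lines:
        # A GEDCOM line is "<level> <tag> <value>"; split off level and tag only.
        fields = line.split(" ", 2)
        parts.append(fields[2] if len(fields) > 2 else "")
    return (" " * spaces).join(parts)


split_mid_word = ["2 NOTE This is a note that is sp", "3 CONC lit on two lines."]
split_after_word = ["2 NOTE This is a note that is split", "3 CONC on two lines."]

# Both calls print: This is a note that is split on two lines.
print(join_conc(split_mid_word, spaces=0))
print(join_conc(split_after_word, spaces=1))
```

Using the wrong value on either file gives the symptoms you would expect: `spaces=0` on the second file produces "spliton", and `spaces=1` on the first produces "sp lit".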

Unfortunately there is nothing in the GEDCOM file telling you which method the program used. I could write a procedure that scans the file and “guesses” the method, or use some other form of artificial intelligence. I might be able to make that 98% accurate, but never 100%. So there has to be a way to change the setting when the assumed method is wrong.
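To give a feel for what such a guessing procedure might look like, here is one possible heuristic in Python. It is only a sketch under stated assumptions and not something Behold does: it collects words that appear whole inside lines, then checks each CONC join to see whether gluing the two fragments makes a known word (a vote for 0 spaces) or whether both fragments are already known words (a vote for 1 space). The name `guess_conc_spaces` is hypothetical, and the approach only handles plain ASCII letters.

```python
import re


def guess_conc_spaces(lines):
    """Guess 0 or 1 spaces per CONC join by majority vote; None if undecided."""
    values = []
    for line in lines:
        fields = line.split(" ", 2)
        tag = fields[1] if len(fields) > 1 else ""
        values.append((tag, fields[2] if len(fields) > 2 else ""))

    # Vocabulary: words not at a line edge, so they cannot be fragments.
    vocab = set()
    for _, value in values:
        words = re.findall(r"[A-Za-z]+", value)
        vocab.update(w.lower() for w in words[1:-1])

    votes = {0: 0, 1: 0}
    for (_, prev), (tag, cur) in zip(values, values[1:]):
        if tag != "CONC":
            continue
        head = re.search(r"[A-Za-z]+$", prev)
        tail = re.match(r"[A-Za-z]+", cur)
        if not head or not tail:
            continue  # this sketch only looks at letter-to-letter joins
        left, right = head.group().lower(), tail.group().lower()
        if left + right in vocab:
            votes[0] += 1  # gluing the fragments makes a known word
        elif left in vocab and right in vocab:
            votes[1] += 1  # both sides are already whole words
    if votes[0] == votes[1]:
        return None
    return 0 if votes[0] > votes[1] else 1
```

A short file gives it very little vocabulary to work with, and words like "on" or "in" can fool it, which is exactly why a guess like this can never be 100% reliable.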

What I have done in Behold is make a list of programs that use the correct CONC, which splits words and does not require an added space. I assume all others use the “bad” CONC, which does not split words and requires the space.

My current list of programs that output CONC correctly to GEDCOM is not very long. It includes PAF, Brother’s Keeper, Legacy, The Master Genealogist (TMG), RootsMagic and only some versions of Family Tree Maker. I haven’t rigorously gone looking for them. If you know of any others, I’d be happy to find out about them and I’ll add them into Behold.
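A list-based default like this could be sketched in Python as follows. It assumes the exporting program is identified from the `1 SOUR` line of the GEDCOM `HEAD` record; the identifier strings in the set are placeholders taken from the program names above, and real files may spell them differently. Family Tree Maker is left out because only some versions qualify, which would need a version check. The names `MID_WORD_SPLITTERS`, `header_source` and `default_conc_spaces` are hypothetical.

```python
# Programs listed as splitting mid-word (no space needed at a join).
# These strings are placeholders, not verified header identifiers.
MID_WORD_SPLITTERS = {"PAF", "BROTHER'S KEEPER", "LEGACY", "TMG", "ROOTSMAGIC"}


def header_source(lines):
    """Return the value of the first "1 SOUR" line in the HEAD record, or None."""
    in_head = False
    for line in lines:
        fields = line.strip().split(" ", 2)
        if len(fields) < 2:
            continue
        level, tag = fields[0], fields[1]
        if level == "0":
            if in_head:
                break  # the header record has ended
            in_head = tag == "HEAD"
        elif in_head and level == "1" and tag == "SOUR":
            return fields[2] if len(fields) > 2 else None
    return None


def default_conc_spaces(lines):
    """Return 0 for programs on the list, 1 for everything else."""
    source = header_source(lines)
    if source is not None and source.strip().upper() in MID_WORD_SPLITTERS:
        return 0
    return 1
```

Anything not on the list, including a file with no recognizable header, falls through to 1 space, matching the assumption that unknown programs use the “bad” CONC.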

That’s all wonderful. But then I found out that The Master Genealogist added a user option that lets users output CONC tags the incorrect way, presumably so that the file can be read by programs that don’t understand the “correct” way. So now I can’t even rely on a specific program always doing it the same way. Boo to TMG for that.

On my Organize GEDCOMs page, I now have a CONC value for each GEDCOM. It specifies how many spaces to add when concatenating lines, either 0 or 1. It will automatically be set to what I’ve assumed is the default for the program that wrote the file. But now you can change that value and save it with your Behold file so it is remembered. The next beta release will include this.
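The idea of a per-file value with an assumed default and a saved user change can be sketched roughly like this in Python. The class name `ConcSetting`, its fields, and the use of JSON for saving are all assumptions made for illustration; the Behold file format is not described here.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ConcSetting:
    program_default: int                 # 0 or 1, assumed from the exporting program
    user_override: Optional[int] = None  # set only when the user changes the value

    def effective(self) -> int:
        """Return the user's value if one was set, else the program default."""
        value = self.program_default if self.user_override is None else self.user_override
        if value not in (0, 1):
            raise ValueError("CONC value must be 0 or 1")
        return value


setting = ConcSetting(program_default=1)
setting.user_override = 0  # the user saw missing spaces or glued words and changed it

# Save alongside the rest of the file's settings and load it back later.
saved = json.dumps(asdict(setting))
restored = ConcSetting(**json.loads(saved))
print(restored.effective())
```

Keeping the default and the override as separate fields means a later correction to the assumed program default does not overwrite a choice the user made deliberately.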

So if, in your notes as displayed by Behold, you see spaces where they shouldn’t be or two words run together where they shouldn’t be, either the CONC setting needs changing or the GEDCOM you’re looking at simply has tons of typos.

Addendum: May 29, 2012: The maximum line length allowed in GEDCOM, right back to version 5.3, is 255 characters. For some reason almost all programs chose to split their GEDCOM lines before they exceed 80 characters (the old punched card limit). I’m not sure why that is. Maybe an earlier version of GEDCOM (5.0 or 4.x) had an 80 character limit.
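For the writing side, here is a rough Python sketch of splitting a long value into CONC lines the mid-word way, with the line limit as a parameter so that 255 or 80 can be tried. It assumes the limit counts the whole line including level and tag, and it backs the break point up so that the characters on both sides of it are non-spaces. The function name `split_with_conc` is hypothetical.

```python
def split_with_conc(level, tag, value, max_len=255):
    """Split `value` into GEDCOM lines of at most `max_len` characters.

    Breaks are placed between two non-space characters so that a reader
    joining with zero spaces recovers the original text exactly.
    """
    lines = []
    prefix = f"{level} {tag} "
    while True:
        room = max_len - len(prefix)
        if len(value) <= room:
            lines.append(prefix + value)
            return lines
        cut = room
        # Back up until neither side of the cut is a space.
        while cut > 1 and (value[cut - 1] == " " or value[cut] == " "):
            cut -= 1
        lines.append(prefix + value[:cut])
        value = value[cut:]
        prefix = f"{level + 1} CONC "


for line in split_with_conc(2, "NOTE", "This is a note that is split on two lines.", max_len=32):
    print(line)
# 2 NOTE This is a note that is sp
# 3 CONC lit on two lines.
```

The tiny `max_len=32` is only there to reproduce the example from the top of this post; a real writer would pass 255, or 80 to match what most programs do.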
