
Louis Kessler's Behold Blog

Double Match Triangulator - Version 1.4 - Fri, 20 Jan 2017

DMT is a semi-finalist in the #InnovatorShowdown at #RootsTech 2017. This is a new version of the program with several improvements.

You can get the new version on my DMT page. It is freeware to help you do autosomal DNA segment analysis.

Now Works with Older CBR files

My own FamilyTreeDNA results came in 11 days ago. When I downloaded my Chromosome Browser Results (CBR) file and ran it through DMT, it didn’t find any triangulations with anybody. That’s because my results were brand new. The other CBR files I had did not know about my results because when they were downloaded, my results weren’t in the system yet.

DMT used to check that Person a’s file had matches with Person b and Person b’s file had the same matches with Person a. If not, DMT wouldn’t use the a-b matches. So there would be people who Double Matched, but nobody would Triangulate.

To handle this situation, Version 1.4 now only needs the a-b matches in either Person a or Person b’s file. Now you won’t need to update all your older CBR files whenever you get a new tester in your family. Of course, you’ll only Double Match with Person c people who got their results before the older of your Person a and Person b files was downloaded, since Person c has to appear in both files. Eventually you may want to update your older CBR files with newer ones, especially if there’s a particular Person c missing from the analysis. But updating your files is no longer necessary.
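The "either file" rule can be sketched in a few lines of Python. This is only an illustration, not DMT's actual code: the `Segment` type, the `ab_segments` function and the layout of each file as a dictionary from match name to a set of segments are all assumptions made for the example.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One matching segment (hypothetical layout)."""
    chromosome: str
    start: int
    end: int


def ab_segments(a_file: dict[str, set[Segment]], a_name: str,
                b_file: dict[str, set[Segment]], b_name: str) -> set[Segment]:
    """Return the a-b segments found in either person's file.

    Each file maps a match name to the segments shared with that match.
    An older file may simply not list the newer tester yet.
    """
    from_a = a_file.get(b_name, set())  # a-b segments as seen in Person a's file
    from_b = b_file.get(a_name, set())  # a-b segments as seen in Person b's file
    return from_a | from_b              # union: either file is enough
```

With the old rule, an empty `from_a` or `from_b` would have discarded the a-b matches entirely. Taking the union means one up-to-date file is enough.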

Prevents the Same Person from being used Twice

This was annoying. If you had several CBR files for a person, downloaded on different dates, and you ran By Chromosome to combine everything, then the person would be included as Person b multiple times.

Now DMT checks the names of the Person b people. If the same name shows up, it will only use the last file when ordered by filename alphabetically, which should be the one with the latest date.

This way, you can download new CBR files and leave them alongside the older ones for comparison, and DMT will only use the newest in its By Chromosome runs.
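A minimal sketch of that selection rule in Python, assuming each file has already been reduced to a (person name, filename) pair. The function name and the filenames in the usage note are made up for the example and are not taken from DMT.

```python
def latest_file_per_person(files: list[tuple[str, str]]) -> dict[str, str]:
    """Keep one file per person: the last one when filenames are sorted alphabetically.

    `files` holds (person_name, filename) pairs. When filenames end in a
    yyyymmdd date, the alphabetically last one is also the most recent.
    """
    chosen: dict[str, str] = {}
    for name, filename in files:
        if name not in chosen or filename > chosen[name]:
            chosen[name] = filename
    return chosen
```

For example, given hypothetical files `uncle_20160801.csv` and `uncle_20170109.csv` for the same person, only `uncle_20170109.csv` would be kept.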

Excludes non-matches from the By Chromosome Analysis

Originally, I thought it was okay to include all the Chromosome Browser Results files in the By Chromosome analysis. I thought that even if Person b does not match Person a, the Double Matches should still be meaningful.

Yes, that is true, but …

This will lead to a false interpretation if Person b actually does match Person a on some segments, but they are below FamilyTreeDNA’s threshold for considering them a match. The segments that were a-b matches would then incorrectly show up in DMT as Missing a-b Segments rather than as Triangulations. This is very bad because Double Match Theorem 1 would get you to conclude that this segment is on the other half of the Chromosome pair than it really is. That would make you conclude that this is a paternal match when it is really maternal, or vice versa.

So that had to change. Non-matches are excluded in the By Chromosome Analysis.

Better Handling of Duplicate Segments in CBR files

FamilyTreeDNA unfortunately downloads matches in its CBR files by match name rather than kit number. If two people have the same name, e.g. John Smith, or if one person tested twice under the same name, all those matches will be in the CBR file mixed together looking like one person. DMT puts a “##” before the name of people with this problem, so that you will be aware when you use those segment matches.

Duplicate segments usually occur because a person tested twice. In most cases, all the segments are duplicated (or even triplicated if someone did 3 tests). This case is easy to detect, and all the extra entries can be removed. Then this Person c can be used without worry. DMT now fixes this for you and there is no “##” before these people’s names.
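The "easy to detect" case can be illustrated with a short Python sketch. It assumes each segment row for one match name is a tuple such as (chromosome, start, end, cM). The function names and the row layout are assumptions for the example and may differ from how DMT does it.

```python
from collections import Counter


def is_fully_duplicated(segments: list[tuple]) -> bool:
    """True when every segment row appears the same number of times, more than once.

    That pattern is what a person who tested two (or three) times under
    one name would produce.
    """
    counts = Counter(segments)
    repeats = set(counts.values())
    return bool(counts) and len(repeats) == 1 and repeats.pop() > 1


def drop_duplicates(segments: list[tuple]) -> list[tuple]:
    """Keep the first copy of each segment row, preserving the original order."""
    return list(dict.fromkeys(segments))
```

If only some of the rows repeat, `is_fully_duplicated` returns `False`. That is the mixed case of two different people with the same name, which still needs the “##” warning.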

For the overlapping people, if you really need to fix one or two because they are critical in your analysis, you can go to FamilyTreeDNA’s Chromosome Browser and look that name up. You’ll see more than one person. You can download their individual matches and manually doctor up your CBR files, but you’ll have to make up a different name for the other, e.g. John Smith and John Smith2. This is messy because your CBR file for Person b will also have its John Smiths together, and your John Smith2 won’t match anyone in Person b’s file unless you fix that file as well. Ugh!  Better to wait for FamilyTreeDNA to fix this problem, if anyone knows how to let them know about it.

Improvements to the People Page

This is likely the most visible improvement. It is on the People page for individual Double Match runs, and for the By Chromosome run. The two have been made more consistent.

image

And now all segment matches use consistent notation for the largest Single Matches between Person a and Person c on each Chromosome, 1 to 22 and X (sometimes referred to as 23).

If a-c Triangulate on that Chromosome, then the largest length in cM of any a-c segment that Triangulates is prefixed by the letter "T" and is shown in green so it can be easily picked out, e.g. image

If a-c does not Triangulate on that Chromosome, but does Double Match, then the largest length in cM of any a-c segment that Double Matches is prefixed by the letter "D", e.g. image

X matches will be shown in column ACX with red text, and the letter "T" or "D" will be followed by "X", e.g. image or image
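The labelling rules above could be sketched in Python as follows. The exact look of the cells in DMT is only shown in the images, so the number formatting (one decimal place) and the function name `cell_label` are assumptions for illustration. Colors are left out.

```python
def cell_label(largest_cm: float, triangulates: bool, is_x: bool = False) -> str:
    """Build the text for one chromosome cell on the People page.

    "T" marks a Triangulation and "D" a Double Match that does not
    Triangulate. An "X" follows the letter for X chromosome matches.
    """
    prefix = "T" if triangulates else "D"
    if is_x:
        prefix += "X"
    return f"{prefix}{largest_cm:.1f}"
```

Under these assumptions, a 12.5 cM Triangulation gives `T12.5` and an 8 cM Double Match on X gives `DX8.0`.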

Also all Triangulating people are shown first, ordered highest to lowest in their total a-c cM, so the closer relatives will be listed earlier on.

————————–

I found that I needed the above changes once I downloaded my own data. I’m sure they’ll be useful to you as well if you use DMT.

It took me 6 days to make these changes. I know I worked hard to get this working over that time. So I was curious and I counted up the number of DMT runs that I had to do to implement, test and debug all this. I was able to total up the number of DMT log files that were created each day. The counts were:

Sunday, Jan 15 - 48
Monday, Jan 16 - 33
Tuesday, Jan 17 - 30
Wednesday, Jan 18 - 80
Thursday, Jan 19 - 73
Friday, Jan 20 - 33

Wow! I thought I worked hard on this, but I never expected that it would have taken me 297 runs of Double Match Triangulator to get the changes in this version working.

In total I’ve got log files for 1,668 Double Match Triangulator runs dating back to my first prototype run on June 26, 2016 when I first added the log file.

GEDCOM 1 Lives! - Sat, 14 Jan 2017

I found out about this on December 29, when Martin Geldmacher of Germany requested a trial key for Behold on the Behold Download page. He wrote in the “Please let me know how you found out about Behold” box the following:

I am trying to find a way to read/convert an old Gedcom 1.0 file (created by FHS). A few hours of googling brought me to your blog posts about "prehistoric" gedcom files. While no support for 1.0 is promised, I still want to try it out if it can help me.

Well, that was definitely interesting to me. I’ve done my part in the past to resurrect ancient GEDCOMs. In August 2014, I found what I thought was The World’s Oldest GEDCOM File? Tamura Jones confirmed for me that this was just a GEDCOM 2.0 file, and that there was still GEDCOM 1 before it. Tamura wrote an article about GEDCOM 1.0 and told me that he had a GEDCOM 1 file in his collection. It was a sample file that Phillip Brown created. Phillip Brown is the author of Family History System. He is the only programmer to have implemented the very first GEDCOM specs, and an earlier version of his program, Family History System, is the only program known to have exported GEDCOM 1. Later versions of FHS exported GEDCOM 2.0 and later.

I was a user of Family History System many years ago, first purchasing it in 1993. Like most genealogists, I never throw anything out, and I still had a hardcopy of the FHS user manual which had the GEDCOM 1 specs in it. I then wrote my article: From Ancient GEDCOM to Prehistoric GEDCOM, where I said:

Will I support GEDCOM 1.0 in Behold? Well I could. But I doubt if anyone has any files of that format lying around that they really need to extract the data from. Let me know if you do.

So I was very surprised by Martin’s claim of having a GEDCOM 1 file. I emailed Martin back and I said to him:

If you really have some GEDCOM 1.0 files, I’d love to see them. They are a rarity.

And if Behold doesn’t work right for them, then I can get it to.

Martin wrote back and told me the unbelievable. He said the file was created by his father Joachim in the 1990s. It contains about 10,000 people that included not only his family, but the whole small German town where they were from. That data was eventually compiled into a book that contains town history and genealogy information, and Martin’s ancestors go back to the year 1658. The GEDCOM file was dated 2015, which Martin believes was when it was copied from his father’s old DOS computer. Martin attached a copy of the file for me.

It took me about 10 days to implement GEDCOM 1 reading in Behold. On January 10, I quietly released version 1.2.2 of Behold. I went back to the stable version of 1.2.1 (rather than using the 1.3 development version I’m nearing completion on) and added GEDCOM 1 support to it.

GEDCOM 1 uses 2-letter tags that are not separated from their level number, e.g. “0HH” is the header record, whereas “0 HEAD” is what that was changed to in later GEDCOM. Handling this was relatively simple. I originally mapped the 2-letter tags to their 3- or 4-letter equivalents, but there were too many that didn’t match, so I changed that so Behold would recognize the tags directly, as I do for the GEDCOM 2.0 tags.
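Splitting such a line into its parts might look like the following Python sketch. It assumes a line is a level number, immediately followed by a 2-letter uppercase tag, optionally followed by a value. That layout is inferred from the “0HH” example and is not taken from the GEDCOM 1 specs. The function name is hypothetical and this is not Behold's code.

```python
import re

# Level digits, then a 2-letter tag with no separator, then an optional value.
LINE_RE = re.compile(r"^(\d+)([A-Z]{2})\s*(.*)$")


def parse_gedcom1_line(line: str) -> tuple[int, str, str]:
    """Split a GEDCOM 1 style line into (level, tag, value)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"Not a recognizable GEDCOM 1 line: {line!r}")
    level, tag, value = match.groups()
    return int(level), tag, value
```

For “0HH” this returns level 0, tag `HH` and an empty value. The tag is returned as-is rather than translated to a later GEDCOM tag, in line with recognizing the 2-letter tags directly.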

I was hoping the family structure would be similar to GEDCOM 2.0 which connected siblings youngest to oldest together rather than listing children of families as later GEDCOM does. Yes it did connect the siblings, but of course it had to be oldest to youngest. And it did so via each parent, so for any person, you have the father’s next child and the mother’s next child. You also have the children pointing to their father and mother. So I had to custom build the conversion of this to the CHIL/FAMC connections in use today. That will allow this information to be exported to GEDCOM 5.5.1 once I add GEDCOM export to Behold (coming next, after version 1.3 is released).
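One way to picture that conversion is the Python sketch below. It assumes each person record carries four pointers: father, mother, the father's next child and the mother's next child. The `Person` class, its field names and both functions are invented for illustration, and Behold's real conversion is likely to differ. The sketch walks each parent's sibling chain from oldest to youngest and groups the children into families keyed by (father, mother), which is the information the CHIL/FAMC links need.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    id: str
    father: Optional[str] = None
    mother: Optional[str] = None
    father_next: Optional[str] = None  # father's next child after this one
    mother_next: Optional[str] = None  # mother's next child after this one


def ordered_children(people: dict[str, Person], parent_id: str, role: str) -> list[str]:
    """Walk one parent's sibling chain; role is "father" or "mother"."""
    next_attr = f"{role}_next"
    kids = {p.id for p in people.values() if getattr(p, role) == parent_id}
    pointed_to = {getattr(people[k], next_attr) for k in kids}
    order: list[str] = []
    for head in sorted(kids - pointed_to):  # normally exactly one chain head
        cur = head
        while cur is not None and cur in kids and cur not in order:
            order.append(cur)
            cur = getattr(people[cur], next_attr)
    return order


def build_families(people: dict[str, Person]) -> dict[tuple, list[str]]:
    """Group children into families keyed by (father, mother), oldest first."""
    families: dict[tuple, list[str]] = {}
    for father in sorted({p.father for p in people.values() if p.father}):
        for kid in ordered_children(people, father, "father"):
            families.setdefault((father, people[kid].mother), []).append(kid)
    # Children with no recorded father are reached through the mother's chain.
    lone_mothers = {p.mother for p in people.values() if p.mother and not p.father}
    for mother in sorted(lone_mothers):
        for kid in ordered_children(people, mother, "mother"):
            if people[kid].father is None:
                families.setdefault((None, mother), []).append(kid)
    return families
```

Each key in the result corresponds to one family record, and each list gives that family's children in chain order. Half-siblings end up in separate families because the mother differs even though the father's chain is shared.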

What actually caused me the most trouble was that there were no BIRT, MARR, DIV or DEAT level 1 events. Instead there were BD (Birthdate), BP (Birthplace), MD, MP, etc. tags at level 1. They needed to be mapped to level 2 tags under their level 1 tag that I had to create. This was tricky as you have to wait to encounter the following tags before you create the earlier one. It is tough to do that efficiently in what is a sequential parser. But I found a solution that worked well enough.
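The post does not describe the solution that was used, so the following Python sketch shows just one possible approach. It creates the level 1 event the first time one of its parts is seen and keeps it open while the following tags belong to the same event. Only the BD, BP, MD and MP tags named above are mapped. The function name, the (tag, value) input format and the values in the usage note are assumptions for the example.

```python
# GEDCOM 1 level-1 tag -> (later GEDCOM level-1 event, level-2 sub-tag)
EVENT_TAGS = {
    "BD": ("BIRT", "DATE"),
    "BP": ("BIRT", "PLAC"),
    "MD": ("MARR", "DATE"),
    "MP": ("MARR", "PLAC"),
}


def regroup_events(fields: list[tuple[str, str]]) -> list[str]:
    """Turn one person's level-1 (tag, value) pairs into nested event lines."""
    lines: list[str] = []
    open_event = None  # the event whose level-1 line was emitted most recently
    for tag, value in fields:
        if tag in EVENT_TAGS:
            event, sub_tag = EVENT_TAGS[tag]
            if event != open_event:
                lines.append(f"1 {event}")  # create the parent on first need
                open_event = event
            lines.append(f"2 {sub_tag} {value}")
        else:
            lines.append(f"1 {tag} {value}")  # unrelated tag closes the event
            open_event = None
    return lines
```

Given a BD of 1658 followed by a BP of “Sample Town” (made-up values), this produces `1 BIRT`, `2 DATE 1658` and `2 PLAC Sample Town`, with the parent line emitted only once.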

So I was able to read this GEDCOM 1 input:

image

And display it in Behold like this:

image

The only new thing in Behold Version 1.2.2 is the ability to read GEDCOM 1 files. Unless your last name is Geldmacher, I doubt you’ll need to upgrade to this version.

I am amazed that the first non-example GEDCOM 1 file produced from a real genealogical research study is one so detailed and comprehensive. My congrats go to Martin’s father for such an effort. And I thank Martin for searching me out and allowing me to use his father’s file.

GEDCOM 1 Lives!

Note: Martin has given me permission to include his and his father’s name in this article. However, his GEDCOM 1 file has information about living people in it and I’ve promised him I wouldn’t share it.

What to do When Your @FamilyTreeDNA Autosomal Results Come In - Tue, 10 Jan 2017

It’s always an exciting day when your DNA results come back. After analyzing my uncle’s results ad infinitum for the past six months, I finally bit the bullet and sent away for my own test. The FamilyTreeDNA Holiday Season sale with coupons for extra dollars off was quite motivating.

I ordered my own test on November 14. It arrived on Nov 24 and I did the cheek swabs the next day and mailed it. Yesterday, I got the email that said my results were in.

So these are the Steps I recommend that you follow when your autosomal results come in:

Step 1. Don’t Panic! Yes, you can be excited, but take your time and do it right.

Step 2. Sign in online to your myFTDNA account. If you haven’t already, do some administrative stuff first. Once your results are there, people are going to find you and you won’t want to literally look like a nobody. Add an “About Me” and image to your Profile.

image

Step 3.   Next to the Account Settings tab is the Genealogy tab. Enter your Most Distant Ancestors:

image

Step 4. Also on the Genealogy tab, enter your Ancestral Surnames:

image

Step 5. Now go to Family Tree and enter your parents, grandparents, great-grandparents, etc. No need to enter all the details, but make sure you have all your ancestors’ names, birth years and death years. And I wouldn’t yet enter any other brothers, sisters or children/grandchildren of your ancestors. That gums things up a bit. There is a better way to add them. See Step 11 below.

image

There. You’re all set up. People can find you and see whether they might share family with you. By providing this information, you’ve invited them to contact you.

Step 6. Now check out your Family Finder results. Go to the matches page and, if you’re like me and are from an endogamous population, prepare to be overwhelmed.

image

You’ll see you have (A) 8,392 people who match you, and it will take (B) 280 pages to display them. Then (C) check the first few at the top and see if there is anyone you know who is a relative.

Step 7. Either breathe a sigh of relief when you see your uncle shares 1,861 cM as he should, or agonize a bit … maybe a lot … as you search desperately for your uncle in the list. You know he tested. If he’s not there, then go back to Step 1, except now you Panic!

Step 8. You don’t want to go through the Family Finder matches one-by-one. There’s a better way. Download your matches. At the bottom of the Family Finder matches page, there are two buttons:  CSV or Excel

image

CSV is a comma delimited text format. Excel isn’t really Excel but is XML. It doesn’t matter which one you use. They both contain the same information. And Excel or any spreadsheet software can read them both. I usually use CSV, so hit the CSV button and download the file.  It will be called something like: nnnnnn_Family_Finders_Matches_yyyymmdd.csv.

Step 9. Open that file in Excel or your spreadsheet software. Right away save the file with a meaningful name and in the format of your spreadsheet software (e.g. Excel is .xlsx) so that all the spreadsheet features will be retained. I named my file “LK Master Match List.xlsx”. Your file should look like this:

image

… and yes, they are scrunched just about that much.

The key thing here is that you’ve got all your match data and it is much easier to sort or search or highlight or add your notes in a simple spreadsheet than trying to do so on the FamilyTreeDNA site.

The columns include:
- Full Name
- First Name  (You can delete this, as it’s superfluous)
- Middle Name (This too)
- Last Name (You’ll need this to sort by last name)
- Match Date
- Relationship Range
- Suggested Relationship (I’d hide this. Relationship Range is good enough)
- Shared cM
- Longest Block (Change title to: Max cM)
- Linked Relationship
- Email (I’d move this after the Name)
- Ancestral Surnames
- Y-DNA Haplogroup
- mtDNA Haplogroup
- Notes
- Matching Bucket
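For anyone who prefers a script to a spreadsheet, the tidying suggested in the column notes above could be sketched in Python with the standard `csv` module. The function name and the made-up rows in the usage note are purely illustrative. The column names are the ones listed above, and the sketch takes the CSV content as text so it can be tried without a real download.

```python
import csv
import io

DROP_COLUMNS = {"First Name", "Middle Name"}   # superfluous per the notes above
RENAME_COLUMNS = {"Longest Block": "Max cM"}   # suggested shorter title


def tidy_matches(csv_text: str) -> list[dict[str, str]]:
    """Read match rows, drop and rename columns, and sort by last name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        tidy = {
            RENAME_COLUMNS.get(column, column): value
            for column, value in row.items()
            if column not in DROP_COLUMNS
        }
        rows.append(tidy)
    rows.sort(key=lambda r: (r.get("Last Name") or "").lower())
    return rows
```

Fed two invented rows with last names Zeta and Alpha, the result lists Alpha first, without the First Name and Middle Name columns and with Longest Block renamed to Max cM. To use it on a real download, read the file's text first and pass that in.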

The one field that FamilyTreeDNA does not include that I feel is important, that 23andMe includes in their match download file, is the person’s Sex. Oh well, we won’t worry about that for now.

Step 10.   Now go through all your matches in your list of 8,392 and make a note of anyone who is already a known relative of yours. Other than my uncle, I had a 3rd cousin Joel. He and I have been sharing our research on our common line for about 15 years. So I’ve got my Uncle, whose test I submitted, and Joel. That’s it. 2 out of 8,392. Maybe you have more.

Step 11.   It’s time to go back to your Family Tree at FamilyTreeDNA and add your known relatives. For my uncle, I would add him as a son to my grandfather. For my third cousin Joel, I would find our common great-great grandparents, add a son, add a daughter to the son, add a son to the granddaughter, and add Joel under the great-grandson. Don’t bother naming the people in-between.

image

Doing this will not only let you get a picture of where your relatives are in your DNA tree, but it will also get the FamilyTreeDNA site to start calculating something pretty cool for you. It’s going to use the DNA matches of your relatives to figure out, for all your 8,392 people, who matches you on your father’s side, who matches you on your mother’s side, who on both sides, and who matches but the side isn’t determined yet. You’ll then see this on your matches page:

image

Both my uncle and Joel are on my father’s side, so it has figured out that 2,168 of my 8,392 matches also match to either my uncle or Joel. And now those little symbols for paternal image, maternal image, or both image, will be shown alongside each of your matches to which they apply.

Step 12.   With that bit of extra information, download your Family Finder matches again (see Step 8). This time, that last field called “Matching Bucket” will be loaded with Paternal or Maternal or Both or N/A.
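Once the file has been downloaded again, a quick tally of the “Matching Bucket” column shows how the matches split. Here is a minimal Python sketch using only the standard library. The function name is hypothetical, and treating a blank bucket as “N/A” is an assumption rather than something FamilyTreeDNA documents.

```python
import csv
import io
from collections import Counter


def bucket_counts(csv_text: str) -> Counter:
    """Count matches per "Matching Bucket" value (Paternal, Maternal, Both, N/A)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Assumption: an empty bucket is counted together with "N/A".
    return Counter(row["Matching Bucket"] or "N/A" for row in reader)
```

Comparing the Paternal count with the number shown on the matches page is an easy check that the new download picked up the buckets.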

Step 13.   Now go at it and have fun. Look in your spreadsheet for people you know, your ancestral surnames, anything that looks interesting. Use the Notes column for your thoughts on this match. Color code people that are of interest to you. Add other columns as you see fit. If you don’t end up with something pretty like this, then you’re doing it wrong:

image

Step 14.   If you made it all the way here, that’s super!

Now you should be daring. Contact the DNA relatives you don’t know who you might be able to figure out a relationship to, and your DNA adventure will begin. You’ll be consumed by it.