Louis Kessler's Behold Blog

Help Needed for DMT - Thank You! - EAST Part 3 - Sat, 2 Jul 2016

I have the basics of my Double Match Triangulation program working, but before I can release it to the world (as freeware!), I must put it through its paces and test it with some real data and ensure that it will correctly analyze and display the data and relationships.

Since I’ve only DNA tested my 93 year old uncle Harry, and since two people’s Chromosome match files are needed for the program to work on, I cannot do this by myself. So I contacted several of the people listed as matches on my uncle’s Family Finder page at FamilyTreeDNA to see if they would help out with my research.

I was overwhelmed by the enthusiastic response. Everyone, myself included, is looking to find some way to make some sense out of their autosomal matches, and then there’s the potential promise that true triangulation made easy by my DMT program could save loads of time and help us figure out how some of our matches are related.

My uncle’s match list (which is growing daily as FamilyTreeDNA finds new matches) is currently up to 7,865 matches and still only has one confirmed relative.

The one confirmed relative is Joel, who is my 3rd cousin and my uncle’s 2nd cousin once removed, on my uncle’s (my father’s brother’s) mother’s father’s side. Joel is 3rd out of 7,865 on my uncle’s match list with 134.8 cM shared. Joel and I have been communicating for years, working with several other cousins on that common side of our families. Joel sent me his chromosome match file.

Then I found Seth, whose ancestral surname was Braunstein (the same as my uncle), whose family originated in a town in Romania less than 100 km from where my uncle’s Braunstein ancestors came from. He didn’t show up prominently in my uncle’s matches until FamilyTreeDNA’s recent algorithm update. Seth moved up from a 5th to remote cousin sharing 127.1 cM to a 2nd to 3rd cousin sharing 130.7 cM. I’m very hopeful we’ll find the connection between Seth and my uncle because we know it will be on both our paternal lines. Seth sent me his chromosome match file.

Another person high up on my uncle’s match list was Erika, listed as a 2nd to 3rd cousin at 160.0 cM. She caught my attention when I was putting all my Pikholz connections together in preparation for my day at the Ontario Genealogical Society Conference where Israel Pickholtz was going to speak. About half the people in Israel’s book: Endogamy, One Family, One People were on my Uncle’s match list. But Erika was the closest of anyone with a Pickholtz connection. I contacted George, Erika’s cousin who was administering her account and is himself listed as a 3rd to 5th cousin at 92.2 cM shared. George sent me both his and Erika’s chromosome match files.

Then there’s a FamilyTreeDNA project for an area of Ukraine that I joined on behalf of my uncle. Four people from there, Sandy, Barbara (2nd – 4th cousin, 102.5 cM), Bruce (2nd – 4th cousin, 97.0 cM) and Mark, who have connections to my uncle, sent me their chromosome match files. Barbara and Bruce also each sent me two more of the files they administer. Sandy sent me 23 files in total covering quite a few relatives in her family, many of whom are among my uncle’s matches. Sandy has considerable experience in triangulation and has given talks on her analysis using it. I look forward to working with Sandy to help figure out her/our families.

Last but not least is Arnold Chamove who has been a Behold user for almost a year. He and I have had many good talks since then about Behold and what it does and should do. So it was a bit surprising when I found 6 people whose DNA Arnold administers listed in my Uncle’s match lists, the closest of whom is his cousin Roger (2nd – 3rd cousin, 144.2 cM). Arnold has given me access to 23 of the chromosome match files that he administers. It will be fun helping Arnold put his families together and finding out what our connection is.

It is very interesting that I can’t offhand connect yet to any of these 2nd to 4th cousins except for Joel. Most Ashkenazi lines only go back about 5 generations, and due to endogamy, 2nd to 4th cousins can mean 3rd to 6th cousins, even though FamilyTreeDNA says they try to correct for this.

I’ll be taking these 58 Chromosome Match files and using them for testing and to determine how best to analyze, interpret and present the triangulation data.

Most of the chromosome match files are from full Ashkenazi heritage with all its endogamy. These files range from 8 MB to 14 MB in size and the largest have more than 200,000 chromosome segment matches to 8,000 people. Non-Jewish chromosome files I’ve been sent seem to be about one tenth that size.

[Image: 9 x 8 = 72 combinations both ways]

And the DMT program does Double Match Triangulation, meaning it needs two match files for a comparison. I will compare every pair of files. That will be 58 times 57, or 3,306 comparisons both ways. The program takes about 5 seconds per comparison (comparing two files of 200,000 lines each), so once I get the automated selector working, I’ll let it run for several hours to do them all.
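The counts above are simple to check. Here is a minimal Python sketch of the arithmetic, assuming "both ways" means every ordered pair (A, B) with A different from B. The file names are placeholders, and the 5 seconds is just the rough figure I quoted, not a benchmark.

```python
from itertools import permutations

def ordered_pairs(n_files):
    """Number of ordered (A, B) pairs with A != B, i.e. comparisons 'both ways'."""
    return n_files * (n_files - 1)

# Placeholder file names; the real files are the 58 chromosome match files.
files = [f"file_{i}" for i in range(58)]
pairs = list(permutations(files, 2))
assert len(pairs) == ordered_pairs(58)

SECONDS_PER_COMPARISON = 5  # rough figure quoted in the post
total_hours = ordered_pairs(58) * SECONDS_PER_COMPARISON / 3600

print(ordered_pairs(9), ordered_pairs(58), round(total_hours, 1))
```

So 3,306 comparisons at about 5 seconds each works out to roughly four and a half hours, which is where "several hours" comes from.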

There was one person I asked who would not give me his chromosome match file. It wasn’t that he wanted to keep his information private. Au contraire, Meir is a world expert at Y-DNA research, specializing in the Levite line, and he receives hundreds of DNA files from people willing to help. I know he’d be more than willing to help me.

But Meir’s reason was very interesting. He said to me:

The “autosomal soup” is not science. far from it.
It is pseudo science on verge of charlatanism.
Leave me out of this fiasco.

I pressed him further on this, and he told me he’d make an exception for me if I could meet a challenge. Putting aside the known relations, if I could show him how a mere 7 unknowns out of the 7,600 are related to Harry, he’d be willing to participate. So, sort of like Sodom and Gomorrah, which needed 10, I’ve got to find 7 good people who I can match to.

I told Meir this is a fair offer. I said I don’t know if I will succeed in identifying 7 relationship paths just using the triangulation information, but I shall try. The rewards of succeeding are just too great to ignore.

It’s going to be fun!

Obviously, 3,306 pairs of test files is enough for me for now. But if you check your FamilyTreeDNA matches and notice that Harry Braunstein is listed as one of your matches, contact me, and I’ll try to include you in my tests. 

Extreme Autosomal Segment Triangulation (EAST) - Part 1
EAST Part 2 - Double Match Triangulation

FamilyTreeDNA’s Chromosome Match File - Sat, 25 Jun 2016

My DMT (Double Match Triangulation) program that will make use of FamilyTreeDNA chromosome segment match data is nearing completion. As I was looking through some of the results it was producing, I got an unexpected surprise.

I found there were some people my uncle Harry matched to in his Chromosome Browser match file that were not among his Family Finder matches. I found out that:

The FamilyTreeDNA Family Finder information is different from their Chromosome Match information

To make sure, I re-downloaded both the Family Finder matches and the Chromosome Match file for my uncle. My uncle had 7,777 people listed as matches in his Family Finder Match download (FFMD). In his Chromosome Match download (CMD), my uncle had 176,436 segment matches which were from 8,017 people.
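For anyone who wants to make the same count on their own Chromosome Match download, here is a minimal Python sketch. It assumes the file is a CSV with a header row and a `MATCHNAME` column; those column names and the sample rows are my assumptions for illustration, so check them against your own file.

```python
import csv
import io

def summarize_cmd(csv_text, name_column="MATCHNAME"):
    """Return (number of segment rows, number of distinct match names)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    segments = 0
    people = set()
    for row in reader:
        segments += 1
        people.add(row[name_column])
    return segments, len(people)

# Invented sample rows; the header names are assumed, not taken from a real file.
SAMPLE = """NAME,MATCHNAME,CHROMOSOME,START LOCATION,END LOCATION,CENTIMORGANS,MATCHING SNPS
Harry,Person One,1,1000,2000,10.64,2500
Harry,Person One,2,5000,9000,7.10,1800
Harry,Person Two,1,1500,2600,18.76,4100
"""

segment_count, people_count = summarize_cmd(SAMPLE)
print(segment_count, people_count)
```

Note that counting by name treats two different people with the same name as one person, which is exactly the double-entry problem described next.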

There were 26 people whose names were listed twice in the FFMD. Many of them were two different people with the same name, but a few were the same people with two different tests done. But they were merged together in the CMD and their matches were combined into 26 single people. This is a mistake by FamilyTreeDNA that they should fix. Since my program uses the Chromosome Match download, the two kits will be treated as one for matching until this is fixed. (See my post: Misleading Double Entries in FamilyTreeDNA Data, which gives more information about this problem.)

Also, the FFMD file downloads Unicode characters correctly, but the CMD file does not. So the CMD file does not display names that are written with accents or in a different script correctly and sometimes does not include the person at all. FamilyTreeDNA should fix this as well. There are 31 people in the CMD whose names do not display properly or who are not included in the FFMD file.

Then the people’s names in the CMD file have two spaces between their first name and last name. There should be only one space between their names, as in the FFMD file. Fix please.

The bigger question is why does the FFMD have 7,777 people versus the 8,017 in the CMD? That’s 240 people who are in the Chromosome Match file but are not listed in the Family Finder. These people all have significant cM matches. I don’t know why they are in the Chromosome Match file but don’t show up in Family Finder. My suspicion is that there is some criterion that is filtering them out of the Family Finder matches. I don’t know what that is. Maybe someone from FamilyTreeDNA should explain. Whatever the reason, the people listed in the Family Finder as matches should be the same as the people listed in the Chromosome Browser download, and FamilyTreeDNA should fix this.
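Finding the people who appear in one download but not the other is a set difference on names, once the double-space problem is smoothed over. This is only a sketch with hypothetical helper names and invented sample names. It matches on names alone, so it cannot tell apart two different people who share a name.

```python
def normalize_name(name):
    """Collapse runs of whitespace so 'Ann  Example' equals 'Ann Example'."""
    return " ".join(name.split())

def only_in_cmd(cmd_names, ffmd_names):
    """Names present in the Chromosome Match download but not in Family Finder."""
    ffmd = {normalize_name(n) for n in ffmd_names}
    cmd = {normalize_name(n) for n in cmd_names}
    return sorted(cmd - ffmd)

# Invented example names, for illustration only.
cmd_names = ["Ann  Example", "Bob  Sample", "Cy  Missing"]
ffmd_names = ["Ann Example", "Bob Sample"]

print(only_in_cmd(cmd_names, ffmd_names))
```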

Next problem. For the 7,777 matching people, the total cM for 660 of them in the CMD was at least 5 cM higher than what was given as the Shared cM in the FFMD.

So I picked a person at random who was among these 660. In FamilyFinder, my uncle has this match to Carole who shows up as having 47.70 Shared cM with Harry.


I then went to the Chromosome Browser, found Carole, and did the Download to Excel (CSV Format) that is the 1st Optional View at the top of the page. It gave the following:


Ah, I see. The difference is that the Family Finder shared cM don’t include the X chromosome cM. This is okay. Since the X chromosome is included in the Chromosome Match Download file, my program will be able to find and triangulate X matches as well.
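The comparison I did by hand can be sketched in Python: add up the segment cM for each person, once with the X chromosome and once without. The tuple layout and the numbers below are invented for illustration, and I am assuming the X chromosome is labelled "X" in the data.

```python
from collections import defaultdict

def total_cm(segments, include_x=True):
    """Sum cM per match name; segments are (match_name, chromosome, cm) tuples."""
    totals = defaultdict(float)
    for match_name, chromosome, cm in segments:
        if chromosome == "X" and not include_x:
            continue
        totals[match_name] += cm
    return dict(totals)

# Invented numbers, not anyone's real data.
segments = [
    ("Example Match", "1", 20.0),
    ("Example Match", "5", 10.5),
    ("Example Match", "X", 6.0),
]

with_x = total_cm(segments)["Example Match"]
without_x = total_cm(segments, include_x=False)["Example Match"]
print(with_x, without_x, with_x - without_x)
```

If the total without X lines up with the Family Finder Shared cM, the X chromosome explains the difference for that person.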

So there are a few glitches in FamilyTreeDNA’s creation of the Chromosome Match Download file. I have raised several concerns above. If they get addressed by FamilyTreeDNA, it would help people like myself who want to make use of the file. If anyone reading this knows any of the technical people at FamilyTreeDNA, please let them know about this blog post. They can contact me if they want more explanation.

Nonetheless, those aren’t show-stopping problems. My DMT program should still do a pretty good job analyzing the Chromosome Match Download file, despite its minor flaws.

EAST Part 2 - Double Match Triangulation - Tue, 14 Jun 2016

In Part 1, I gave you a flavour of the mass-triangulation that I am doing, which I called EAST: Extreme Autosomal Segment Triangulation. Triangulation is a technique to determine what parts of your DNA come from what ancestors. That will help you determine how your matches are related to you.

What is new about this EAST technique is that it uses segment matches of two people (your own and a relative) rather than just one (your own).


Single Match Triangulation (SMT)

To make this clear, let me first describe the standard way people currently triangulate with FamilyTreeDNA data. I’m going with the same example I used in Part 1, using my uncle Harry as person A, and my 3rd cousin Joel as person B. They are 2nd cousins once removed and their most recent common ancestors are my great-great-grandparents Hirsch Focsaner and his wife Dwora.

To triangulate, I first need the segments that match between Harry and Joel. I go to Harry’s FamilyTreeDNA account, select Chromosome Browser, and pick Joel to match to. It gives this diagram:

Harry and Joel match on the orange segments.

Now we need to find a relative of both Harry and Joel to be the third person of the triangulation. We go back to the FamilyTreeDNA matches page and next to Joel’s name, we click on the 4th cute little symbol below his name to “Run Common Matches”.


That brings up a second menu and we select “In Common With”. Then, if you are lucky like me, you’ll be presented with 232 pages of matches containing the 2,318 people who match to both Harry and Joel.

Now write down the names of the top 4 and go back to the Chromosome browser and add them along with Joel. Now you’ll see:


These are still Harry’s chromosomes. Joel’s matches with Harry are shown in orange. We want a third person who matches with Harry and Joel. These four people only have one instance, in chromosome 1, where one of the others matches one of Joel’s segments. You can see it in chromosome 1 as the green line that is under the orange line. Joel’s match with Harry (the orange line) is 10.64 cM. The green match is with someone named Daniel and it is 18.76 cM. So we have a triangulation. Harry matches Joel where Joel matches Daniel and Daniel matches Harry.

The chromosome match setting was for a minimum of 5+ cM. You could go down to 1+ cM and you’ll find a lot more matches. But there’s a problem with this. Because of the way DNA analysis companies determine matches (that half-identical thing), there is a very good chance that small matches are not Identical by Descent, and you don’t want that: you need them to reflect a true relationship.

So you’ll have to stick to those 5+ cM matches to be safe.
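What I did by eye in the chromosome browser can be sketched in Python: look for places on Harry’s chromosomes where Joel’s segment and a third person’s segment overlap, ignoring anything under the 5 cM threshold. The `Segment` type and function names are hypothetical, and the start and end positions are invented. Only the 10.64 and 18.76 cM values come from the example above.

```python
from typing import NamedTuple, Optional, Tuple

class Segment(NamedTuple):
    chromosome: str
    start: int
    end: int
    cm: float

def overlap(a: Segment, b: Segment) -> Optional[Tuple[int, int]]:
    """Return the overlapping (start, end) of two segments, or None."""
    if a.chromosome != b.chromosome:
        return None
    start, end = max(a.start, b.start), min(a.end, b.end)
    return (start, end) if start < end else None

def candidate_triangulations(segs_b, segs_c, min_cm=5.0):
    """Overlaps between B's and C's matches to A, skipping segments below min_cm."""
    hits = []
    for b in segs_b:
        if b.cm < min_cm:
            continue
        for c in segs_c:
            if c.cm < min_cm:
                continue
            region = overlap(b, c)
            if region:
                hits.append((b.chromosome, region[0], region[1]))
    return hits

# Positions are invented; the cM values are the ones quoted above.
joel = [Segment("1", 100, 200, 10.64)]
daniel = [Segment("1", 50, 400, 18.76), Segment("2", 100, 200, 3.0)]

print(candidate_triangulations(joel, daniel))
```

The overlap only covers two legs of the triangle, both measured from Harry’s side. In the walkthrough above, the third leg comes from the “In Common With” list.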

But in the above we did find that one triangulation we can use. That third person has a segment in common with Harry and Joel. This indicates that the third person has a common ancestor with Harry and Joel. It could be Hirsch and Dwora or it could be an ancestor of Hirsch and Dwora.

So now I invite you to continue to do this for the other 2,314 common matches of Harry and Joel. You’ll tire quickly!

Doing this allows you to create Triangulation Groups, building them up person by person. Triangulation Groups put likely-related people together. The analysis of triangulation groups is complicated and has been written up elsewhere. Jim Bartlett describes it very well on his wonderful Segmentology blog, but I’m not going to get into it, because this only uses segment matches of a single person. I’m going to be doing it differently using the segment matches of two people.


Double Match Triangulation (DMT)

Just to let you know, the terms Single Match and Double Match triangulation (SMT and DMT) as well as EAST (Extreme Autosomal Segment Triangulation) are my own. I invented them so that I can talk about them. As far as I can tell, no one else has extended regular triangulation this way. The closest thing I’ve seen so far is Roberta Estes’ article Just One Cousin, which used chromosome matches between three people. But I want to go from three to extreme. So let’s get into it.

The reason why SMT is referred to as “Single” match is that only the segment matches of one person are used. Only Harry’s matches in the example above are used. Although we found the people who matched to Joel, we did not use Joel’s segment matches.

To do the Double Match Triangulation, I emailed Joel and he sent me his match list. Please see Part 1 where I describe what this file is and how to get it. I merge my uncle’s chromosome match list with Joel’s match list and I put it into Excel and add some fancy coloured mapping of the chromosomes.

Doing this for the same chromosome 1 region used in the above SMT example gives the following (which is the same picture I showed in Part 1):


The line in yellow is the chromosome 1 match of Harry with Joel. The green area with X’s on the yellow line is their match segment. Remember that second picture of FamilyTreeDNA’s chromosome browser from above? Look again at Chromosome 1:


The short orange line is my line in yellow. The longer green line is the line exactly 6 lines below my line in yellow, belonging to Daniel. The part of that line shown in green with X’s is Daniel’s match with both Harry and Joel. The two parts on either end shown in red with a’s are Daniel’s match with Harry (but Joel doesn’t match). On other segments you can see the line in red with b’s. Those are places the third party matches to Joel but not to Harry.
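The X, a and b labelling described above can be sketched in Python. This is not my actual DMT code, just an illustration of the idea: for one third person on one chromosome, take their segments matching person A and their segments matching person B, and split the chromosome into regions where they match both (X), only A (a) or only B (b). The function name and the positions are invented.

```python
def classify_regions(a_segments, b_segments):
    """Label regions of one chromosome for a third person C.

    a_segments: (start, end) ranges where C matches person A.
    b_segments: (start, end) ranges where C matches person B.
    Returns (start, end, label) tuples with label 'X' (both), 'a' or 'b'.
    """
    points = sorted({p for seg in a_segments + b_segments for p in seg})
    regions = []
    for left, right in zip(points, points[1:]):
        in_a = any(s <= left and right <= e for s, e in a_segments)
        in_b = any(s <= left and right <= e for s, e in b_segments)
        if in_a and in_b:
            label = "X"
        elif in_a:
            label = "a"
        elif in_b:
            label = "b"
        else:
            continue  # gap where C matches neither person
        if regions and regions[-1][2] == label and regions[-1][1] == left:
            # Extend the previous region instead of starting a new one.
            regions[-1] = (regions[-1][0], right, label)
        else:
            regions.append((left, right, label))
    return regions

# Invented positions: C matches A over a long stretch and B over a shorter one inside it.
matches_a = [(50, 400)]
matches_b = [(100, 200)]

print(classify_regions(matches_a, matches_b))
```

With these made-up positions you get an "a" region on either end and an "X" region in the middle, the same shape as Daniel’s line in the picture.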

What’s great about this Extreme triangulation technique is that:

  1. It picks out everybody who has matching segments to you AND to a selected second person. That gives all three connections needed of the triangulation triangle for everyone in a block with one of those yellow lines. This really increases the odds of the three of you being Identical By Descent (IBD). Jim Bartlett says he’s fairly confident that triangulation works down to 5 cM. Jim also says “shared segments below 5 cM are uncharted territory for triangulation.” And he was talking about Single Match Triangulation. New research about Double Match Triangulation by Michael Maglio indicates that a false positive is statistically improbable, indicating the match is IBD (or maybe IBP – identical by population, which is still IBD, but too many generations back to be of much use). So Double Match Triangulation can be used even for small segments.
  2. You get to see not just all the third-party segments matching to you, but also the third-party segments matching to your second person that don’t match to you. This is additional information you don’t get from normal SMT triangulation that I’ll soon show is very useful.
  3. You only have 1/16 of your great-great-grandfather’s segments. But your 3rd cousin has another 1/16. With DMT, you’ve doubled the segments you can match with.
  4. I suspect all three connections may not be necessary. You and your cousin will only match on 1/16 of each other’s segments. So if you find what looks like a big Triangulation Block of known cousins, and you match to them, and your cousin matches to them, that may be good enough. I’ll have to test this, and if it works, it will make this technique another order of magnitude more powerful in classifying your matches.
  5. Huge time savings for analysis. One EAST is a Triangulation with every single one of your matches at once. And that’s just using one selected known relative as the second person. You can use others as well. You don’t even have to use known relatives. EAST should show you if the second person is significant within your matches.
  6. Lots more that I haven’t even worked out yet.
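The 1/16 figure in point 3 is just repeated halving. Here is a tiny Python sketch, assuming each generation passes on half on average (real inheritance varies from person to person). The function name is made up for illustration.

```python
from fractions import Fraction

def expected_share(generations_back):
    """Average fraction of autosomal DNA inherited from one ancestor."""
    return Fraction(1, 2 ** generations_back)

# A great-great-grandfather is 4 generations back.
mine = expected_share(4)
with_cousin = 2 * mine  # assumes the cousin's 1/16 does not overlap with mine

print(mine, with_cousin)
```

The doubling is an upper bound. Wherever you and your cousin inherited the same piece, the two shares overlap.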

What we haven’t done yet is to use the EAST data to analyze and classify the segments of your matched people, to put them into Triangulation Groups and identify common ancestors and where everyone fits in. That will be in the next post of this series.


… One last thing:

Triple and Multiple Match Triangulation (TMT and MMT)

I want to define these now, because I see it is possible. Get the segment matches of 3 or more relatives and put them all in the same file together. Process them the same way as described in Part 1.

I don’t know if, this early in the study of what EAST can do, getting into this complication is worthwhile. It will be visually hard to interpret because instead of having 3 colours (green for both match, blue for only A matches, red for only B matches), with Triple Match you’ll need 7 colours and Quadruple Match would need 15 colours.
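The colour counts come from the number of ways a third party can match some but not none of the people being compared, which is 2 to the power n, minus 1. A short Python sketch that lists the patterns (the letters are placeholders for the people):

```python
from itertools import combinations

def match_patterns(people):
    """Every non-empty subset of people that a third party could match."""
    return [
        combo
        for size in range(1, len(people) + 1)
        for combo in combinations(people, size)
    ]

for people in ("AB", "ABC", "ABCD"):
    print(len(people), len(match_patterns(people)))
```

For two people that gives A only, B only, and both, which are the 3 colours used in DMT.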

It might be better to do a DMT three times (each of the three in a TMT paired three times) as each DMT would be easier to interpret than the one TMT.

But I’m getting way ahead of myself. Classifying segments will be next.

Follow-up June 20:  Yesterday, A Triangulation Intervention was posted by Blaine Bettinger on his blog, explaining what is correct triangulation for autosomal analysis. He says:

The only way to perform true triangulation is to have segment data and a way to confirm that an overlapping segment is actually shared by two or more genetic matches.

He says the only true triangulation tool available is the Tier 1 Triangulation tool at GEDmatch. And he says:

It is very important to note that tools like KWorks, JWorks, and ADSA at DNAGedcom, and Matching Segment Search at GEDmatch, while incredibly powerful and valuable tools, do NOT perform triangulation.

I wanted to mention this, because it’s important to understand that the tools and techniques I am developing here with EAST and DMT are all true triangulation techniques. They work with the matching segments of two people and triangulate them, not just with one or two “third people”, but with all the third people at once.

p.s. I’m building a utility program to do this EAST with DMT automatically. I expect I’ll be able to get it to classify your matches for you into true triangulation groups. It will also create comma delimited files you can import into a spreadsheet to visualize your three-way matches like I do in my Excel examples above. When the program is ready, I plan to make it available as freeware.

Help Needed for DMT - Thank You! - EAST Part 3