Louis Kessler’s Behold Blog

DMT - The Horizon Effect - Tue, 3 Dec 2019

In Version 3 of Double Match Triangulator, I added the ability to specify the smallest segment match that DMT would consider to be part of a valid triangulation (default 7 cM) and the smallest segment match that DMT would consider to be a valid single match (default 15 cM).

A situation that can happen when you get close to the triangulation limit is something I will call the horizon effect.  If two of the three valid overlapping matches in a triangulation are at or over the triangulation limit (i.e. >= 7 cM), but the other is slightly under it (e.g. 6.9 cM), then you’ve got a problem. DMT will eliminate the small segment and incorrectly classify the triplet, not as a triangulation, but as a Missing A-B or Missing B-C match.
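
To make the horizon effect concrete, here is a minimal sketch in Python. It is not DMT's actual code: the function name `classify_strict`, its parameters, and the label strings are assumptions made for illustration, and it presumes the three segments have already been found to overlap.

```python
def classify_strict(ab_cm, ac_cm, bc_cm, limit=7.0):
    """Classify an overlapping A-B / A-C / B-C triplet with a hard cutoff.

    Hypothetical illustration only: any match smaller than `limit`
    is treated as if it were absent from the match files.
    """
    ab_ok = ab_cm >= limit
    ac_ok = ac_cm >= limit
    bc_ok = bc_cm >= limit
    if ab_ok and ac_ok and bc_ok:
        return "Triangulation"
    if ac_ok and bc_ok:
        return "Missing A-B"   # A-B fell under the limit and was dropped
    if ab_ok and ac_ok:
        return "Missing B-C"   # B-C fell under the limit and was dropped
    return "Other"


print(classify_strict(7.4, 7.2, 7.1))  # all three at or over the limit
print(classify_strict(6.9, 7.2, 7.1))  # A-B just under the limit
print(classify_strict(7.4, 7.2, 6.9))  # B-C just under the limit
```

The second and third calls differ from the first by only a few tenths of a cM, yet the hard cutoff changes how they are classified.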


Is this a Major Problem?

To be honest, I would say no.

  1. Leaving out valid triangulations only gives less data to work with but is not a problem.
  2. The misclassifying of a triangulation as Missing B-C might allow the B-C match to be used incorrectly as an inferred match.
  3. The misclassifying of a triangulation as Missing A-B would get the A-C match to map onto the incorrect parent.

But cases 2 and 3 shouldn’t be too concerning since DMT uses a consensus approach. If the majority agree it is a triangulation through a particular common ancestor, then the (hopefully) fewer misclassified matches will be outnumbered by the good ones.
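
As a rough illustration of why a few misclassified matches tend not to matter, here is a simple majority vote in Python. DMT's real consensus approach is not described in this post, so this is only a stand-in: the function name `consensus` and the labels "paternal" and "maternal" are assumptions.

```python
from collections import Counter


def consensus(votes):
    """Return the most common label, or None if there are no votes or a tie.

    Hypothetical stand-in for a consensus step: each vote is the
    ancestral side suggested by one match.
    """
    top = Counter(votes).most_common(2)
    if not top:
        return None
    if len(top) == 2 and top[0][1] == top[1][1]:
        return None  # no majority, so make no call
    return top[0][0]


# Eight good matches outvote two misclassified ones.
votes = ["paternal"] * 8 + ["maternal"] * 2
print(consensus(votes))
```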


A Possible Improvement

Even so, I’d like to see if I can address this horizon effect and do something to reduce the number of misclassified matches. I came up with an idea.

Currently, DMT ignores all matches in the Person A and Person B match files that are below the triangulation limit. I can change that so that Person A segment matches that are less than the limit will still be compared to Person B matches.

e.g. If we have a B-C match of 7.2 cM that overlaps with an A-C match of 6.8 cM and an A-B match of 6.7 cM, then DMT will now say that is a triangulation.
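
The changed rule can be sketched in Python as follows. This is my reading of the idea rather than DMT's code: the function name `is_triangulation_relaxed` is hypothetical, a missing match is represented by `None`, and the three segments are assumed to overlap already.

```python
def is_triangulation_relaxed(ab_cm, ac_cm, bc_cm, limit=7.0):
    """Relaxed check: only Person B's match (B-C) must meet the limit.

    Hypothetical illustration. Matches from Person A's file (A-B and
    A-C) are accepted at any size present in the data; pass None for
    a match that is not in the file at all.
    """
    if bc_cm is None or bc_cm < limit:
        return False  # B-C is still held to the triangulation limit
    return ab_cm is not None and ac_cm is not None


# The example above: B-C 7.2 cM, A-C 6.8 cM, A-B 6.7 cM.
print(is_triangulation_relaxed(ab_cm=6.7, ac_cm=6.8, bc_cm=7.2))
```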

What is the extra bit on the B-C match?  Well, it could be an extra bit at either end that matches by chance, or it could be that B and C are more closely related than A and C and have a larger match between them.

I know some A-C and A-B matches below the triangulation limit will then be included, but that limit is no magic number. Segments above the limit are not necessarily valid, and segments below it are not necessarily invalid. We are simply using the limit to pick the point at which we expect that most triangulations will be valid.


Can’t Always be Done

DMT 3’s inclusion of smaller A-C matches for triangulations will only work if the match data contains segments smaller than the limit selected. If the limit you select in DMT is 5 cM, but your match data does not include segments smaller than 5 cM, then DMT will not have any smaller A-C segments to work with.

In that case, the horizon effect will occur more often and DMT’s consensus approach will have to be relied upon to produce reasonably logical results.

Lower limits of individual segment matches at each company are:

  • Family Tree DNA:  1 cM
  • 23andMe:  5 cM  (on the X chromosome:  2 cM)
  • MyHeritage DNA:  6.1 cM
  • GEDmatch:  default 7 cM, but you can reduce that down as low as 1 cM
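
A quick way to check whether your data can supply smaller segments is sketched below in Python. The dictionary simply restates the list above, using GEDmatch's default; the names `COMPANY_MIN_CM` and `has_segments_below_limit` are hypothetical and not part of DMT.

```python
# Smallest segment (in cM) each company reports, from the list above.
# GEDmatch is shown at its default; it can be lowered when downloading.
COMPANY_MIN_CM = {
    "Family Tree DNA": 1.0,
    "23andMe": 5.0,
    "MyHeritage DNA": 6.1,
    "GEDmatch": 7.0,
}


def has_segments_below_limit(company, limit_cm, min_cm=None):
    """Return True if the match data can contain segments under the limit.

    `min_cm` overrides the table, e.g. for a GEDmatch download made
    with a lowered minimum.
    """
    floor = COMPANY_MIN_CM[company] if min_cm is None else min_cm
    return floor < limit_cm


print(has_segments_below_limit("23andMe", 7.0))
print(has_segments_below_limit("23andMe", 5.0))
print(has_segments_below_limit("GEDmatch", 7.0, min_cm=5.0))
```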

If you’re using GEDmatch, you could download just Person A’s segment matches to a slightly lower limit. e.g. if your triangulation limit is 7 cM, try downloading A’s segment match file to 5 cM.  I would not go as low as 1 cM at GEDmatch. Doing so is known to introduce too many false matches. See False Small Segment Matches at GEDmatch.

If your segment match files go down to a certain cM, e.g. 6.1 cM, then you could raise your triangulation limit in DMT a bit, say to 8 cM.

Personally, I don’t think it’s necessary to worry too much about this fine tuning. DMT should give reasonably similar results whichever way you do it. Really, you’d be much better off spending your time trying to identify common ancestors of more of your DNA relatives, as that will improve DMT’s results the most.


So How Did It Do?

I made the above changes to my working version of DMT and ran the same data that I did for my 23andMe article.

This time around, DMT included 175 A-C segment matches between 6 and 6.99 cM and 169 segments between 5 and 5.99 cM. With the 892 people I match, these extra segments increased the number of triangulations I have from 1355 to 1757, an increase of 402 triangulations. 7 cM is at the lower limit of valid triangulation size, so some of those that include segments down to 5 cM might not be valid and may instead be by-chance matches. Picking a very conservative number out of my head and saying that only 80% of these are valid matches, this adds about 320 new valid triangulations and about 80 false triangulations. The power of consensus again should work to use that extra data advantageously.
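
The arithmetic behind those estimates, as a small Python sketch. The 80% figure is the off-the-cuff assumption stated above, not a measured rate, and the function name `estimate_added` is hypothetical.

```python
def estimate_added(before, after, valid_rate):
    """Split newly added triangulations into estimated valid and false counts."""
    added = after - before
    est_valid = added * valid_rate
    est_false = added - est_valid
    return added, est_valid, est_false


added, est_valid, est_false = estimate_added(1355, 1757, valid_rate=0.80)
print(added)             # new triangulations
print(round(est_valid))  # estimated valid, before rounding down to "about 320"
print(round(est_false))  # estimated false
```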

Final results are that 816 (up from 790) of the 892 people I match with are now assigned clusters, and grandparent mappings now cover 52.8% of my paternal side, up from 46.1%. 

The improved grandparent mapping (from DNA Painter) is:

[Image: improved grandparent mapping from DNA Painter]

Compare this to the 46.1% diagram from before, and you’ll have a hard time finding the differences, which is good:

[Image: earlier 46.1% grandparent mapping from DNA Painter]


Update to DMT Coming

I think it’s worthwhile including this small improvement in an update to DMT. I’ve got a few more small fixes/improvements to make and one other idea for using the results from one company to initialize the run for another company. So hopefully within a week or two, I’ll have a new release of DMT available.
