Louis Kessler's Behold Blog

Double Match Triangulator 1.1.99 - Tue, 18 Oct 2016

I’ve released a beta of what will be version 1.2 of DMT.

You can get it here:  www.beholdgenealogy.com/DMT-1-99-1-setup.exe

It has all the new functionality. I just haven’t updated the help file yet. Once the help file is updated, I’ll re-release it as 1.2.

With the deadline for the RootsTech Innovator Summit coming up, I wanted to get some major enhancements in. There are a number of exciting changes.


1. Double Match and Triangulation Groups

DMT will now group the matching segments together based on the crossovers between them. Each group is delineated by a thick box around it.


The idea came from two new posts by Jim Bartlett on his Segmentology blog: Understanding and Using TGs, followed the next day by The Attributes of a TG. In section A6 of the latter article, Jim explains how he identifies crossover points and the 6 steps to find the end location of his Triangulation Groups.

I thought about this, and worked to implement something similar. Believe me when I say I must have changed the procedure a dozen times from my original idea before I got it working. I had to translate what Jim was saying to the data structures of my program.

Along the way, I learned a lot, including the following:

  1. Jim works with single matches that he is manually triangulating. That’s an admirable mass of work and he’s successfully mapped the majority of his segments. He talks about the ends being fuzzy. That is true for single matches because there are often some random matches at the ends. Although I have no proof, my observations indicate that the double match ends seem to be mostly precise. The random ends just don’t seem to be there. Crossover base addresses look very precise. All the single matches with fuzzy ends are excluded.
  2. I realized from Jim’s work that the goal is to make a map of all the ancestors of Person a. You need the b people to do the double matching, but the goal is to map out a single person. Therefore, it is only the green X’s (double matches) and red a’s that should be used to define the Triangulation Groups.
  3. I needed an algorithmic way to determine the start and end of each Triangulation Group. My DMT program couldn’t use “judgement” because there’s “no hard rule”. So I developed a rule, the rule being: If the next segment starts at or after the end of the previous segment, then end the last Triangulation Group and start the next Triangulation Group.
  4. To make that rule work, I had to sort the segments first by the lowest starting base address, and then by the highest ending base address. You can see in the diagram above that the triangulation groups get smaller as you go down the page, and then when the start point changes, they start again. Larger TGs are closer ancestors that are made up of smaller TGs that are segments from ancestors of those ancestors.
  5. I have learned that Double Match segments that don’t Triangulate are as valuable as Double Match segments that do Triangulate. I call them Missing a-b Matches and I wrote about them in an earlier blog post. Triangulated segments belong to a common ancestor. Missing a-b matches may match from both halves of an ancestor’s chromosome, or may match from the same segment address of two ancestral parents (a father and mother). The Double match means that Person a matches Person c and Person b matches Person c. The likelihood of two matches to the same person, both by chance is severely reduced, especially since these are people in common match lists. I believe Jim Bartlett’s suggestion that Triangulated segments are mostly IBD down to 5 cM likely also applies to Double Match groups without the a-b match.
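Here is a minimal Python sketch of the sort-then-split rule from points 3 and 4. It is not DMT's actual code. The `Segment` type, the `group_segments` function and the sample base addresses are all made up for illustration, and it takes "the end of the previous segment" literally as the end of the row just before it in the sorted list.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    # Hypothetical record for one double match row on a single chromosome.
    person_c: str
    start: int  # base address where the match starts
    end: int    # base address where the match ends


def group_segments(segments):
    """Split segments into groups using the rule described above."""
    # Sort by lowest starting address first, then by highest ending address.
    ordered = sorted(segments, key=lambda s: (s.start, -s.end))
    groups = []
    for seg in ordered:
        # Literal reading of the rule: compare with the previous row's end.
        if groups and seg.start < groups[-1][-1].end:
            groups[-1].append(seg)
        else:
            # Starts at or after the previous end: close the group, open a new one.
            groups.append([seg])
    return groups


# Made-up sample data, deliberately out of order.
segments = [
    Segment("c3", 5_000_000, 6_500_000),
    Segment("c1", 1_000_000, 4_000_000),
    Segment("c2", 1_000_000, 2_500_000),
    Segment("c4", 2_000_000, 3_000_000),
]
groups = group_segments(segments)
for number, group in enumerate(groups, start=1):
    print(number, [s.person_c for s in group])
```

With this sample, c1, c2 and c4 land in the first group and c3 starts a second one. One thing to watch with the literal reading: a very short segment nested inside a long one can close a group early. Comparing against the largest end seen so far in the group would be an alternative, but that is my own variation and not something the rule above states.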

Once I got this going, I saw that I could improve the information shown in the Map file. In the above diagram, each row is one double match. The columns are:

  • Person a – the main person you are matching to
  • Person b – the person you are double matching with
  • Person c – the person who double matches both a and b on the segment
  • Chr – the chromosome number. 23 = the X chromosome.
  • Start-AC – the base address where the a-c match starts
  • End-AC – the base address where the a-c match ends
  • CM-AC – the distance the a-c match covers in cM
  • Start-BC – the base address where the b-c match starts
  • End-BC – the base address where the b-c match ends
  • CM-BC – the distance the b-c match covers in cM
  • DMG-START – the base address where the Double Match Group starts
  • DMG-END – the base address where the Double Match Group ends
  • DMGROUP – Jim Bartlett’s name for the Double Match Group
  • STATUS – tells you if that segment triangulates

Once again, Double Match Groups (DMG) are segments where a number of people have double matches and indicate common ancestors. If a Double Match Group also has the a-b match, then it’s also a Triangulation Group (TG) and indicates a common ancestral segment on the same half of the chromosome.


2. Analysis By Chromosome

So that was the major improvement for version 1.2 of DMT. But that’s not all. I’ve added another checkbox to the main window:


If you check the Analysis By Chromosome box, then DMT will combine every individual run between Person a and all the b people that are in Folder b. It will then place the results into 23 files, one for each Chromosome.
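To give a rough idea of what combining the runs means, here is a small Python sketch. It only shows the general shape of the idea and is not how DMT does it internally. The `combine_by_chromosome` function, the dictionary keys and the sample rows are all hypothetical.

```python
from collections import defaultdict


def combine_by_chromosome(runs):
    """Merge the rows of every a-b run and bucket them by chromosome.

    runs maps a Person b name to that run's rows (hypothetical layout).
    """
    by_chr = defaultdict(list)
    for person_b, rows in runs.items():
        for row in rows:
            # Tag each row with the Person b whose run it came from.
            by_chr[row["chr"]].append({"b": person_b, **row})
    # Return the buckets in chromosome order (23 = the X chromosome).
    return dict(sorted(by_chr.items()))


# Made-up rows from two runs of Person a against two different b people.
runs = {
    "B1": [
        {"c": "C1", "chr": 1, "start": 100, "end": 900},
        {"c": "C2", "chr": 23, "start": 50, "end": 400},
    ],
    "B2": [
        {"c": "C1", "chr": 1, "start": 150, "end": 800},
    ],
}
maps = combine_by_chromosome(runs)
for chromosome, rows in maps.items():
    print(chromosome, [(r["b"], r["c"]) for r in rows])
```

Each bucket would then become one map file, so that no single file has to hold every match from every run.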

The main reason why the chromosomes are split up is because the files can become very large. In my case, I have the Chromosome Browser Download files from 61 related people that I combine with my uncle’s. It takes DMT about 5 minutes to process all the files and produce the 23 chromosome map files. It’s an amazing amount of information that I believe may unlock all the secrets we’re looking for … if we can figure out how to do so. I’ll be trying. And if you figure something out, let me know, and maybe I can program it in for everyone to use.

For example, I’m sure there is enough information in the crossover base addresses for DMT to split the Double Match Groups into the two sides: maternal and paternal. That will be something I’ll be looking at to figure out how to do for version 1.3 and I’ll likely call it Double Match Phasing.


3. Improved People Page/File

Along with the 23 map files the Analysis By Chromosome produces, a combined People file is also produced. It looks like this:


This file will help you locate the relevant segments for the people you are interested in, and determine which chromosome file they’ll be in. All of Person a’s Person c matches are listed by highest total cM. The maximum segment length of each a-c match for each Chromosome is shown. That is followed by one column for every Person b so you can determine all the people Person c double matches to.
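If it helps to see that layout as data, here is a minimal Python sketch that builds the same kind of summary. Everything in it is assumed for illustration: the `summarize_people` function, the tuple layouts and the sample numbers are mine, not DMT's file format.

```python
from collections import defaultdict


def summarize_people(ac_segments, double_matches):
    """Build one summary row per Person c, sorted by highest total cM.

    ac_segments: (person_c, chromosome, cM) tuples for the a-c matches.
    double_matches: (person_b, person_c) pairs that double match.
    """
    total = defaultdict(float)   # person_c -> total cM over all a-c segments
    longest = defaultdict(dict)  # person_c -> {chromosome: longest segment in cM}
    partners = defaultdict(set)  # person_c -> the b people they double match with

    for person_c, chromosome, cm in ac_segments:
        total[person_c] += cm
        previous = longest[person_c].get(chromosome, 0.0)
        longest[person_c][chromosome] = max(cm, previous)
    for person_b, person_c in double_matches:
        partners[person_c].add(person_b)

    return [
        (person_c, total[person_c], longest[person_c], sorted(partners[person_c]))
        for person_c in sorted(total, key=total.get, reverse=True)
    ]


# Made-up sample data.
ac_segments = [
    ("C1", 1, 12.5), ("C1", 1, 7.0), ("C1", 5, 9.0),
    ("C2", 1, 30.0), ("C2", 23, 8.0),
]
double_matches = [("B1", "C1"), ("B2", "C1"), ("B1", "C2")]
people = summarize_people(ac_segments, double_matches)
for row in people:
    print(row)
```

Here C2 is listed first because its total of 38.0 cM beats C1's 28.5 cM, and C1's row shows that it double matches with both B1 and B2.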


Like I say, I’m still trying to figure all this out myself. But feel free to try it out.

If you notice any problems or have any ideas, please let me know.


Note: Version 1.1.99.1 corrects a problem where not all Triangulated segments were identified. If you downloaded 1.1.99 between October 18 and October 20, please download from the link at the top of this post.

DMT Entered in Innovator Showdown @RootsTechConf - Tue, 27 Sep 2016

Lots has happened in the past few weeks. I’ve booked my flights and hotel, and registered for RootsTech 2017 in Salt Lake City from Feb 8 to 11. I was there in 2012 and 2014, so it’s been a while and I’m looking forward to renewing some acquaintances and meeting other genealogy software developers, geneabloggers and twitterealogists in person for the first time.

RootsTech Innovator Showdown

My speaker topic on “Using NoSQL Databases in Genealogy Software” for the Innovator Summit day was turned down, but I’ve parlayed that into an entry of my Double Match Triangulator (DMT) program into the Innovator Showdown competition. I don’t recall them having a DNA analysis program entered before so I’ll be interested to see what they think of it, and I’ll be very happy if it’s selected as one of the 10 semi-finalists. I’ve already prepared my DMT page at the Devpost site for the contest and I have to add a 60 to 90 second video about it within the next two months. I haven’t created a video for any of my products yet, so that should be fun.

If you’re planning to go to RootsTech this year, book early. You’ll get a discount off the registration fee. Hotels fill up quickly, as do the labs. Much of the session schedule has been posted and I’ve already noted many of the talks I expect I’ll attend, as well as booked myself for two labs, one with John Woodbury on Chromosome Mapping, and the other by TapGenes on using their health site, which won the Innovator Showdown last year. I’ve also booked the MyHeritage sponsored lunch for Thursday. I’m hoping to join other Geneabloggers when they meet up, and I understand Jill Ball is having a Commonwealth get-together which I’d love to attend (as a Canada representative) and see many of my Unlock the Past cruise friends again.

DMT was in a pretty good state, and I felt good about entering it the way it was. But only a couple of weeks ago, Jim Bartlett, after several months of inactivity, started up his great segment-ology blog again. His latest post on The Attributes of a TG outlined in detail his steps to determine Triangulation Groups. I read that and I immediately thought: DMT should be able to do that! So, I’m now working to finish up a new version 1.2 of DMT which will do this grunt work for you. It should really add to the usefulness of the program which will help for the Showdown.

If you plan to be at RootsTech 2017, let me know and let’s see if we can get together. I’ll be there from Tuesday night through Friday, but I have to leave Saturday morning. And if DMT makes it to the top 5, be sure to talk it up for me and vote for it in the final on Friday.

Triangulation and Missing a-b Segments - Tue, 30 Aug 2016

First to reassure you, I am back working towards finishing Behold Version 1.3.

But I do have to put up this post before I forget about it. Two days ago, I announced my free Double Match Triangulator program on the International Society of Genetic Genealogy (ISOGG)’s Facebook page. It is a closed group, so I doubt if that announcement has public access, but this is what I said:


The ISOGG Facebook group has 11,138 members, and within 24 hours there were 132 reactions and 34 comments. A great response. Many genetic genealogists downloaded the program and I got a lot of feedback.

But it presents the data in such a new way, with single matches and double matches and triangulations, that there was a need to explain what was going on.

So there are two concepts that I brought up there that I have to mention.

1. Triangulation does NOT guarantee Identical by Descent

Full triangulation is where Person a matches Person c, Person b matches Person c, AND Person a matches Person b and they all match on the SAME segment. When they do, that segment is said to be triangulated, and all other people who also double match (which means a matches c and b matches c) on the segment will also triangulate and will form what is called a Triangulation Group (TG).

Some of the thinking out there was that triangulation guarantees that the segment is Identical By Descent (IBD) meaning all three people get that segment from the same ancestor, like this:

Full triangulation

Everyone has pairs of chromosomes, which are made from one of your father’s pair, and one of your mother’s pair. In the diagrams I’ve included, we’ll talk about a segment on a particular chromosome. H1 is one half from one of the ancestor’s parents, and H2 is the other half from the ancestor’s other parent. The ancestor passes down either H1 or H2 to each child. The child’s other half is from their other parent and we can ignore that for now.

In the case shown above, segment H1 was passed down to all three persons. This is Identical By Descent. The three people triangulate on this segment as they all have the same H1 segment. Here, triangulation identifies IBD.

But this is not always the case.

Full triangulation with a chance match

It is possible for 3 people to triangulate when one person has a chance match to the other two.

Let’s say Person a and Person b have a half match H1 on a shared segment that was passed down to them from a common ancestor. That segment is Identical By Descent. Person c could still match Person a’s H1 segment, and also Person b’s H1 segment by chance, which is quite possible for small segments. This will still be a true triangulation, but not IBD.

Kapoweee! IBD for small segment triangulations is blown out of the water.

But it’s not all that bad. Jim Bartlett is fairly confident that triangulation works down to 5 cM. And although some smaller segments will be IBS (by chance), some smaller segments will still turn out to be IBD.

The reason why a smaller criterion like 5 cM can be used is that if Person a and Person b have an IBD segment, then Person c can match by chance, but would need to match by chance ONLY to their H1 segment. They cannot crisscross between the H1 and H2 segments because Person a and Person b’s H2 segments are different.

So I wouldn’t throw them all away. Other information such as multiple people matching and coincident crossover points and single matches adjacent to the triangulated regions may help to identify which small segments are likely IBD – but that is future research.

In the Double Match Triangulator output, multiple people triangulating together makes a Triangulation Group and the relevant parts of the segments are shown with green X’s. Each row represents one person (Person c) who triangulates with Persons a and b. Some of these will be IBD, some will be IBS (by chance). The pink a’s and blue b’s show single matches adjacent to the double match area.

A Triangulation Group as mapped by DMT


2. Missing a-b Double Match Segments are Useful

Whoa! What the heck are Missing a-b Double Match segments? Well, they are double match segments (a matches c and b matches c) where a and b don’t match on the segment, so they don’t triangulate.
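To make the difference between the two cases concrete, here is a minimal Python sketch that classifies one pair of a-c and b-c segments. It is only an illustration and makes a big simplifying assumption: any overlap in base addresses counts as matching "on the same segment", with no minimum cM or SNP threshold. The `overlap` and `classify` functions and the sample addresses are hypothetical, not part of DMT.

```python
def overlap(first, second):
    """Return the shared (start, end) range of two segments, or None."""
    start = max(first[0], second[0])
    end = min(first[1], second[1])
    return (start, end) if start < end else None


def classify(ac, bc, ab_segments):
    """Classify an a-c and a b-c segment that lie on the same chromosome.

    ac, bc: (start, end) base addresses of the a-c and b-c matches.
    ab_segments: list of (start, end) a-b matches on that chromosome.
    """
    double = overlap(ac, bc)
    if double is None:
        return "no double match"
    # A double match triangulates when a and b also match inside that region.
    if any(overlap(double, ab) for ab in ab_segments):
        return "triangulated"
    return "missing a-b"


# Made-up base addresses.
print(classify((100, 900), (300, 1200), [(250, 1000)]))
print(classify((100, 900), (300, 1200), []))
print(classify((100, 200), (300, 400), [(0, 500)]))
```

The first call prints "triangulated", the second "missing a-b", and the third "no double match" because the a-c and b-c segments never overlap.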

This is actually an entirely new concept in autosomal DNA segment analysis. Until I released the Double Match Triangulator program, there were no other tools that produced missing a-b information, so it’s something new that can be used to possibly help you to identify relationships.

I scratched my head for quite a while wondering how the heck a can match c and b can match c with both of those being Identical By Descent matches, without a matching b. It just didn’t make sense that that was possible.

I finally figured it out. I came up with two illustrative cases, and maybe there are more that I didn’t come up with, but these two will do for now.

Let’s go back to our Full Triangulation diagram and change it up a bit:

No a-b Match, but Person a, b and c are IBD to a common ancestor

So let’s say Person a gets segment H1 from this ancestor and Person b gets the other half H2 from this ancestor. One child of the ancestor gets H1 and another child gets H2. Somewhere down the line, two descendants of these children form a couple and have a child. The diagram above shows the couple as 2nd cousins but it could be any relation, even siblings (but that’s not nice).

The couple’s child (in this case Person c), will get a segment from Great-GChild 1, so it could be the H1 segment or GGChild 1’s other half segment from one of the other parents on the way up. Similarly the couple’s child will get a segment from Great-GChild 2, which will either be segment H2 or GGChild 2’s other half segment. There is a 1/2 chance Person c will get H1 and a 1/2 chance they’ll get H2 making a 1/4 chance Person c gets both H1 and H2. Let’s assume Person c does get both.
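The 1/4 figure can be checked by listing the possibilities. This small Python sketch assumes, as the text does, that GGChild 1 carries H1, GGChild 2 carries H2, and each parent is equally likely to pass down either of their two halves for this segment. The variable names are just for illustration.

```python
from fractions import Fraction
from itertools import product

# Each parent passes down either the ancestral segment or their other half.
from_ggchild_1 = ["H1", "other"]
from_ggchild_2 = ["H2", "other"]

# All equally likely combinations Person c can inherit.
outcomes = list(product(from_ggchild_1, from_ggchild_2))
both = [pair for pair in outcomes if pair == ("H1", "H2")]

p_both = Fraction(len(both), len(outcomes))
print(outcomes)
print(p_both)
```

Only one of the four combinations gives Person c both H1 and H2, so the printed probability is 1/4.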

Now Person a’s H1 segment matches to Person c’s H1 segment and it is IBD. Person b’s H2 segment matches to Person c’s H2 segment and it also is IBD. But Person a’s H1 segment does not match to Person b’s H2 segment.

Yet, all three match to the same segment, albeit both halves of the ancestor.

Got it? In other words, these missing a-b double match segments can provide useful information.

Double matching gets some protection from chance matches like triangulation does. If Person a’s H1 match with Person c shares the same segment that Person b’s H2 match does with Person c, the likelihood that both are matching by chance to their other halves is very small. That is especially true since the people selected as Person a and Person b are usually known beforehand to be related. Therefore double matched segments likely approach the 5 cM threshold of Triangulated segments and can be mostly trusted down to that distance. 

For example, I have a few Chromosome Browser Results files for a few people that obviously have no relationship to my uncle. One person shares no triangulated segments and only shares 6 missing a-b segments with 4 people. Those 6 segments are only 1.84, 3.1, 3, 2.94, 4.18 and 3.43 cM. A second person only shares 8 missing a-b segments with 5 people that are 1.81, 2.87, 1.93, 2.39, 2.09, 3, 3.52 and 3.43 cM. The largest among these known-to-be by-chance matches is 4.18 cM.
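As a quick check of those numbers, here they are in a few lines of Python. The lists hold exactly the segment lengths quoted above. The variable names and the comparison against 5 cM are mine.

```python
# Missing a-b segment lengths (cM) for the two unrelated people above.
person_1 = [1.84, 3.1, 3, 2.94, 4.18, 3.43]
person_2 = [1.81, 2.87, 1.93, 2.39, 2.09, 3, 3.52, 3.43]

chance_segments = person_1 + person_2
largest = max(chance_segments)
all_below_5 = all(cm < 5 for cm in chance_segments)

print(len(person_1), len(person_2))
print(largest)
print(all_below_5)
```

All 14 of these by-chance segments fall under 5 cM, which fits with the idea that double matches can be mostly trusted down to about that length. Two people is a tiny sample though, so treat it as an observation and not as proof.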

Once again, other information such as multiple people matching and coincident crossover points and extended single match regions on either side of the double match region may help to identify which small segments are likely IBD – and we must leave this also to future research.

Here’s a second possibility that is very interesting:

No a-b Match, but Person a, b and c are IBD to a pair of ancestors

In this case, Person a gets segment H1 from this ancestor and Person b gets the segment S1 from Person a’s ancestor’s spouse. But of course, the Ancestor is also an ancestor of Person b. Person b just didn’t get the H1 segment. And the Spouse is also an ancestor of Person a and Person a didn’t get the S1 segment.

So let’s pass the H1 and S1 segment down to two Great-Grandchildren who have a child together (Person c). Using the same probability logic as I used earlier, Person c has a 1/4 chance of getting segment H1 from one parent and segment S1 from the other.

Now Person a’s H1 segment matches to Person c’s H1 segment and it is IBD to the Ancestor. Person b’s S1 segment matches to Person c’s S1 segment and it also is IBD, but to the Ancestor’s spouse. Once again there is no a-b match because Person a’s H1 segment does not match to Person b’s S1 segment.

This type of missing a-b double match again is important and usable information that should not be thrown away. Small segment caveat. Multiple matches. Coincident crossover points. Extended single match region. Further research needed.

In the Double Match Triangulator output, below is what a Missing a-b Double Match looks like. The Double Match region is shown with green X’s. Each row represents one person (Person c) who double matches with Persons a and b. Some of these will be IBD, some will be IBS (by chance). The pink a’s and blue b’s show single matches adjacent to the double match area:

A Missing a-b Double Match as mapped by DMT

Hopefully this blog post will give you an insight to help you understand what DMT is displaying for you, and that all of the information presented may be valuable. And maybe you and I together one day, can figure out how to interpret it all and help us tell how we are all related.

Please let me know if you get any Eurekas about DMT and how to use it.