
Louis Kessler's Behold Blog

Double Match Triangulator (DMT) 1.0.1 - Sun, 21 Aug 2016

I’ve now released my new freeware program to provide a new view to help people analyze their autosomal DNA matches from FamilyTreeDNA. It is called Double Match Triangulator. I actually released version 1.0 a couple of days ago, but fixed a bug and version 1.0.1 is available at www.beholdgenealogy.com/dmt.

I’ve already blogged a few times in the last few months about it, so I’ll keep this one short and sweet. What the program does differently is that it combines all the matches of two different people. Any matches that also coincide with the two matches are, by definition, triangulated.
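
To make the idea concrete, here is a minimal sketch in Python of finding "double matches": a third person who matches both Person a and Person b on overlapping positions of the same chromosome. It is an illustration only, not DMT's actual code. The `Segment` type and the function names are made up for the example, and any further checks DMT applies before calling a segment triangulated are not shown.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One matching segment (hypothetical structure for this sketch)."""
    match_name: str
    chromosome: str
    start: int
    end: int


def overlap(a, b):
    """Return the (start, end) range shared by two segments, or None."""
    if a.chromosome != b.chromosome:
        return None
    start = max(a.start, b.start)
    end = min(a.end, b.end)
    return (start, end) if start < end else None


def double_matches(matches_a, matches_b):
    """List the overlaps where the same third person matches both a and b."""
    found = []
    for seg_a in matches_a:
        for seg_b in matches_b:
            if seg_a.match_name != seg_b.match_name:
                continue
            shared = overlap(seg_a, seg_b)
            if shared:
                found.append((seg_a.match_name, seg_a.chromosome, *shared))
    return found
```

The nested loop is fine for an illustration, but with thousands of matches per person you would want to group the segments by match name and chromosome first.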

The program creates an Excel file that includes all matching segment boundaries, along with a Map that allows you to visually look for patterns.

When I first thought of developing this, I thought it would help identify common ancestors and allow me to sort out all my uncle’s 8,000 matches. Well, it’s not quite that simple.

Triangulation does not guarantee Identical By Descent (IBD) matches that indicate a common ancestor who passed down that segment of DNA. Small segments under 7 cM still have a good possibility of randomly matching by chance, even if they are triangulated.

I’ll need some experienced genetic genealogists to take this tool and figure out what can and what cannot be determined from it. If straightforward analysis methods are developed using it, I could program those in and let the program do some of the analysis for you as well.

I’m still optimistic that the Double Match method of looking at autosomal results will lead to identifying family relationships. The segment boundaries (crossovers) are created by one ancestor and passed along several generations until they get wiped out by another parent’s segments, and I’m betting that those crossovers might produce a trail that can be followed to connect all the family members together.

With regard to the random matches, means of separating those from true segments may be forthcoming. In a way, triangulation is akin to phasing in that it helps identify the side of the family the third matches belong to.

And advanced use of multiple DMT files using 60 different people in all combinations as the Person a and Person b may reveal more than we can now imagine. There is a wealth of information here, and a lot of potential.

So if you’ve done some autosomal DNA match analysis and you have access to at least two people’s autosomal DNA results at FamilyTreeDNA (and if you use Windows and Excel), feel free to download DMT and try it out. If you come up with great ways of using it, please let me know.

It took a few months and a couple hundred hours of my time to develop DMT. I am making it available free because I want people to use it.

I was halfway through getting Version 1.3 of Behold ready when I got distracted by the need to explore the DMT concept. But I needed to explore autosomal DNA and I know a lot more now than I did before. Sorry for the brief interlude. We’ll now get back to our regularly scheduled Behold development.

Writing Freeware (Double Match Triangulator) - Sun, 17 Jul 2016

Most people might think releasing a freeware program is easy. Just write it and make it available. Right?

Well, there’s a bit more to it than that.

When I came up with the idea for Double Match Triangulation of autosomal DNA using the chromosome match files produced by FamilyTreeDNA, I knew I’d need a program to sort all that data out. And when I went online to see what there was, and there was nothing like it, I knew I’d have to create it and make it available so that others can use it too.

I first figured out what was needed by doing the matching with Excel. I loaded two chromosome match files into Excel, merged them together, and developed equations to determine segment overlaps. I then used conditional formatting to color the cells to make interpretation easier.

Once that template was set up, it wasn’t too much work to build a program with an engine that would read in two chromosome match files, compare them the same way I was doing in the Excel spreadsheet, and output the results to a csv (comma delimited) file so that Excel could read it in and display it all nicely.
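
The read, compare and write-to-csv pipeline described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program's actual engine: the column headers (`MATCHNAME`, `CHROMOSOME`, `START LOCATION`, `END LOCATION`) are what I assume the downloaded chromosome match file uses, so check them against your own file, and the function names and output columns are invented for the example.

```python
import csv

# Assumed column headers; verify against your own downloaded file.
COLS = ("MATCHNAME", "CHROMOSOME", "START LOCATION", "END LOCATION")


def read_matches(path):
    """Read one match file into {(match name, chromosome): [(start, end), ...]}."""
    segments = {}
    with open(path, newline="", encoding="utf-8-sig") as f:
        for row in csv.DictReader(f):
            key = (row[COLS[0]].strip(), row[COLS[1]].strip())
            span = (int(row[COLS[2]]), int(row[COLS[3]]))
            segments.setdefault(key, []).append(span)
    return segments


def compare_files(path_a, path_b, out_path):
    """Write one csv row per overlap between a's and b's segments
    for the same match name and chromosome."""
    a, b = read_matches(path_a), read_matches(path_b)
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Match", "Chromosome", "Overlap start", "Overlap end"])
        # Only people who appear in both files on the same chromosome.
        for key in sorted(a.keys() & b.keys()):
            for a_start, a_end in a[key]:
                for b_start, b_end in b[key]:
                    start, end = max(a_start, b_start), min(a_end, b_end)
                    if start < end:
                        writer.writerow([*key, start, end])
```

Calling `compare_files("a.csv", "b.csv", "out.csv")` with two such files (the file names are placeholders) produces a csv that Excel can open, though without any of the formatting or coloring.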

 

So at that point, just a few little things to do:

1. Blog about the technique.

2. Get a few sample files from people so I can test it.

3. Test it, and find problems with the input files and handle them.

4. Learn from the results, and figure out more that can be done.

5. Decide what will be in the first cut of the program.

 

Basically the program is done…. Except it’s not.

6. Mock up a user interface to allow selection of files.

[Image: mock-up of the file selection interface]

7. Include Open File dialogs to select the files.

8. Include Open Folder dialogs to select the folders. Wait, there aren’t any Open Folder dialogs available in the Visual Component Library. I have to research my options, see what I did in GEDCOM File Finder, and decide how to implement this.

9. Save past files and directories to the Registry so that they can appear in the recently used list. (You’d hate me and I’d hate myself if I didn’t do this.)

10. Add error checking of file names and input files.

11. Figure out what to put in the status box and log files to track what was done and what wasn’t and any errors encountered.

12. Realize it’s easy to export to csv, but a pain to manually format it once you load the csv file into Excel. So I look for a way to automate the loading of the Excel file directly.

13. Try to make sense of the Office Developer Documentation and find the commands needed amongst the millions of articles.

14. Spend a week implementing the automation, and once it is working, realize it takes 10 times longer than creating the csv file.

15. Puzzle about ways to improve this slowness while in the shower, on my bike and at 3 in the morning.

16. Try various things, and find that creating a temporary csv file and then automating its input is 5 times faster than direct-to-Excel automation.

17. Rewrite everything so that multiple files can be matched at once.

18. Make sure it all looks nice, still works, and does what’s needed.

 

All done now? Yup. Except left to do:

19. An installation script for it.

20. Webpage for it so there’s someplace to download it from.

21. Some documentation would be nice.

22. Blog posts, announcements.

 

Yay! Finally done.  … but forever followed by:

23. Support, bug fixes, response to questions, enhancements

 

So that’s how a freeware program is made. And the timeframe is after work, in the evenings and on weekends when not on errands, when your family lets you be alone, and when you’re not too tired to think.

Hopefully the Double Match Triangulator program will be available in the next week or two for anyone to try out.

Misleading Double Entries in FamilyTreeDNA Data - Thu, 7 Jul 2016

(This article was revised 11 Aug 2016 to fix some incorrect statements)

Be careful if you’re triangulating at FamilyTreeDNA. I just found out they can match twice on a segment.

If you look in your Chromosome Browser Results file which is downloadable from the Chromosome Browser page, you may find matches with a second person that overlap. For instance, look at this match my uncle has with David:

[Image: rows from the Chromosome Browser Results file showing my uncle’s matches with David]

On chromosomes 5, 7 and 12, there are three matches that overlap. The matches on chromosome 7 are identical. This would seem to indicate that one half of my uncle’s chromosome matches with one half of David’s chromosome and the other half of my uncle’s chromosome matches with the other half of David’s chromosome.

You don’t notice this when you use the chromosome browser. It will show just one of the matches:

[Image: chromosome browser view of the same match]

This doesn’t happen often. There are only 198 overlapping matches out of the 178,955 matches in my uncle’s file. But that’s often enough to worry about.
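
If you want to check your own file for these, a small Python sketch like the following would do it. It assumes you have already read the file into `(match_name, chromosome, start, end)` tuples; the function name is made up for this example, and it is not necessarily how I counted the 198 above.

```python
from collections import defaultdict


def find_overlaps(rows):
    """Return pairs of segments listed for the same match name and
    chromosome whose position ranges overlap.

    rows: iterable of (match_name, chromosome, start, end) tuples.
    """
    by_key = defaultdict(list)
    for name, chrom, start, end in rows:
        by_key[(name, chrom)].append((start, end))

    overlaps = []
    for (name, chrom), spans in by_key.items():
        spans.sort()
        for i, first in enumerate(spans):
            for second in spans[i + 1:]:
                if second[0] >= first[1]:
                    # Sorted by start, so no later span can overlap `first`.
                    break
                overlaps.append((name, chrom, first, second))
    return overlaps
```

Two identical segments, like the ones on chromosome 7, are reported as an overlapping pair. Segments that only touch at an end point are not.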

The match of my uncle with David is reported on the Chromosome Browser as having 26 shared segments totalling 87.36 cM. On the Family Finder it is reported as 107.87 cM, and in the chromosome match file I downloaded, after including the 3 overlapping segments shown above and 2 others, there are 59 matching segments totalling 203.56 cM. So what is going on here?

The overlapping matches in the Chromosome match files are not separate matches by FamilyTreeDNA on the two halves of the genome. They don’t do that. Any overlap would look like just one match across both genomes.

Most (if not all) of those overlapping segments come from the incorrect way FamilyTreeDNA lists people in the chromosome match file. The rows are merged by person name, then by chromosome number, and then by location on the chromosome. If two people have identical names, their information is put together as one in the chromosome match file. This is incorrect and needs to be fixed by FamilyTreeDNA. What they need to do is incorporate the kit number into the matching, so that three John Smiths are not put together.
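
A tiny Python illustration of why the key matters. The rows, the kit values `K1` and `K2`, and the field names are all invented for the example; the real file does not include a kit number, which is exactly the problem.

```python
from collections import defaultdict


def group_segments(rows, use_kit):
    """Group segment rows per person, keyed by name alone or by (name, kit)."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["name"], row["kit"]) if use_kit else (row["name"],)
        groups[key].append((row["chromosome"], row["start"], row["end"]))
    return dict(groups)


# Two different people who happen to share a name (hypothetical data).
rows = [
    {"name": "John Smith", "kit": "K1", "chromosome": "7", "start": 100, "end": 900},
    {"name": "John Smith", "kit": "K2", "chromosome": "7", "start": 500, "end": 1200},
]

by_name = group_segments(rows, use_kit=False)
by_kit = group_segments(rows, use_kit=True)
print(len(by_name), len(by_kit))  # prints: 1 2
```

Keyed by name alone, the two people collapse into one entry with two overlapping segments on chromosome 7. Keyed by name and kit, they stay separate and nothing overlaps.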

See also my recent post: FamilyTreeDNA’s Chromosome Match File for more problems with the file that FamilyTreeDNA needs to fix.

And, if you hadn’t noticed, FamilyTreeDNA made some major changes today and updated their Family Finder interface. They now phase your relatives and show which matches are on your father’s side, mother’s side, or both sides. Of course you need more than just one person tested and some known relationships entered before a paternal and maternal side can be assigned:

[Image: screenshot of the updated Family Finder interface]

I also notice they changed the ordering. Your matches are now ordered first by relationship range and then by shared centimorgans. It used to be ordered first by relationship and then by largest segment. As a result, all the matches changed order significantly. But it seems that the relationships and cM values did not change.

For more information about this set of FamilyTreeDNA changes, see Roberta Estes’ post: Family Tree DNA Introduces Phased Family Finder Matches