Louis Kessler’s Behold Blog

The New Chromosome Browser Results File at FTDNA - Sun, 21 Oct 2018

A few days ago, Family Tree DNA released a new version of its chromosome browser. Others, including Kitty Cooper and Roberta Estes, have already described the improvements, but I’d like to focus on one change that affects Double Match Triangulator users.


New Link to Download Segment Matches

The segment matches that you download for use in Double Match Triangulator are downloaded into a file that I’ll call the Chromosome Browser Results (CBR) file, since it is named: 

nnnnnn_Chromosome_Browser_Results_yyyymmdd.csv

where:

  • nnnnnn is your Family Tree DNA kit number,
  • yyyymmdd is the date of your download, and
  • .csv indicates this is a comma delimited file which can be read by Excel and other programs.

It contains a header line, followed by one line for each segment match that you have with every person you match.
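As a small illustration of the naming pattern above, here is a minimal Python sketch that pulls the kit number and download date out of a CBR file name. The function name `parse_cbr_filename` is hypothetical, and the pattern assumes only that the kit number contains no underscore. It is not taken from DMT.

```python
import re

# Pattern for nnnnnn_Chromosome_Browser_Results_yyyymmdd.csv
# Assumption: the kit number is any run of characters without an underscore.
CBR_NAME = re.compile(
    r"^(?P<kit>[^_]+)_Chromosome_Browser_Results_(?P<date>\d{8})\.csv$"
)

def parse_cbr_filename(filename):
    """Return (kit, yyyymmdd) for a CBR file name, or None if it does not fit."""
    m = CBR_NAME.match(filename)
    if m is None:
        return None
    return m.group("kit"), m.group("date")

# Hypothetical kit number, used only for illustration.
print(parse_cbr_filename("123456_Chromosome_Browser_Results_20181021.csv"))
```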

The way you download the file has changed. Previously, you went to the Chromosome Browser page and clicked the “Download All Matches to Excel (CSV Format)” link that I’ve shown below highlighted in orange:

Now, you get it a different way. You must first go to your home screen and click on the Chromosome Browser box:

image

That will take you to the new Chromosome Browser tool page. Before you select anybody, scroll down to the bottom of your list of DNA matches, and you’ll see a “DOWNLOAD ALL MATCHES” link.

image

Click on that download link, and it will start to download your segment matches.

As before, you still have to be patient after you click the link, because there is no immediate indication that anything is happening. I have a lot of matches (17,462 people with whom I have 347,193 segment matches), and it takes about 40 seconds before anything happens at all. Then a window pops up to ask what I want to do with the file, which it says is 19.9 MB in size.


File Format Changes

There are a number of changes to the file itself as well.

1. The file now has a Byte Order Mark (BOM).

The BOM is a few bytes at the beginning of a file that tell programs reading the file what character encoding it uses. The BOM that FTDNA added says that this is a UTF-8 file, meaning it can contain any Unicode character. The BOM is a technical detail you don’t need to worry about, but what it indicates is that the file may contain names and/or words written in almost any language. Non-English letters sort alphabetically after English letters, so you’ll find names starting with non-English letters at the end of the CBR file.

image

If you have downloads of the CBR file from before this change, they did include Unicode text. But without a BOM, or a program otherwise knowing the file was UTF-8, the non-English letters would appear as gobbledygook:

  image

Now that the BOM has been added, text programs, Excel, and DMT all read and display the names correctly. I’m currently working to release version 3.0 of DMT, and I’ll make sure it will also read the names correctly as Unicode from older files you may have downloaded before FTDNA included the BOM.
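For anyone reading these files with their own code, the following is a minimal Python sketch of one way to accept both the old (no BOM) and new (BOM) files. It relies on Python’s standard `utf-8-sig` codec, which drops a leading BOM if one is present and otherwise decodes as plain UTF-8. The function name `decode_cbr_bytes` and the sample name are hypothetical, and this is not necessarily how DMT handles it.

```python
import codecs

def decode_cbr_bytes(raw):
    """Decode file bytes as UTF-8, dropping a leading BOM when present."""
    return raw.decode("utf-8-sig")

name = "M\u00fcller"  # hypothetical match name containing a non-English letter

with_bom = codecs.BOM_UTF8 + name.encode("utf-8")   # new-style file start
without_bom = name.encode("utf-8")                  # old-style file start

# Both forms decode to the same text.
print(decode_cbr_bytes(with_bom) == decode_cbr_bytes(without_bom))

# Reading UTF-8 bytes as a single-byte Windows code page produces the
# kind of garbled text described above.
garbled = without_bom.decode("cp1252")
print(garbled)
```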

2. They changed the file format.

The lines used to end with a Carriage Return and Line Feed (CRLF) which is the Windows file format standard. Now they end with just a Line Feed (LF) which is the Unix file format standard. I can’t imagine why they might have wanted to change this.
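A program that has to read both old and new files can sidestep the difference by normalizing line endings before splitting the text into records. Here is a minimal Python sketch of that idea. The function names and sample rows are hypothetical.

```python
def normalize_newlines(text):
    """Convert Windows CRLF line endings to Unix LF."""
    return text.replace("\r\n", "\n")

def split_records(text):
    """Split file text into non-empty lines, whichever ending it uses."""
    return [line for line in normalize_newlines(text).split("\n") if line]

old_style = "a,b\r\nc,d\r\n"   # CRLF, as in the old files
new_style = "a,b\nc,d\n"       # LF only, as in the new files

print(split_records(old_style) == split_records(new_style))
```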

3. They removed the double quotes from the text fields.

They used to have the name of the person and the name of the match always in double quotes:

image

They’ve removed them and it now looks like this:

image

Either way is fine for csv (comma delimited files), but a program will need to be able to handle it both ways and give the same results. 

One case that causes problems is a name with double quotes in it, e.g. “Buddy” John Williams.  They had previously included this as:  “”Buddy” John Williams”.

If you load the new, unquoted form into Excel, it will display as Buddy John Williams without the quotes around the Buddy, and it will no longer match the “Buddy” John Williams that the previous form gave you.

This change caused a very strange bug in DMT that took me two mornings of debugging and over 100 compiles before I solved it. I had to trace the problem step by step to discover that it was caused by the quoting.
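To illustrate the quoting problem, here is a minimal sketch using Python’s standard `csv` module, which behaves much like Excel here. The sample rows are hypothetical. The old-style row assumes the embedded quotes were doubled in the standard CSV way, which may not be exactly what FTDNA wrote. The `name_key` helper shows one possible way to compare names across formats (ignore double quotes entirely); it is not DMT’s actual fix.

```python
import csv

def parse_line(line):
    """Parse one CSV line into a list of fields."""
    return next(csv.reader([line]))

def name_key(name):
    """Comparison key that ignores double quotes inside a name."""
    return name.replace('"', "").strip()

# Ordinary names: quoted (old) and unquoted (new) parse identically.
old_plain = '"John Smith","Mary Jones",1'
new_plain = 'John Smith,Mary Jones,1'
print(parse_line(old_plain) == parse_line(new_plain))

# A name containing double quotes parses differently in the two forms.
old_nick = '"""Buddy"" John Williams","Mary Jones",1'  # assumed old-style escaping
new_nick = '"Buddy" John Williams,Mary Jones,1'        # new unquoted style
print(parse_line(old_nick)[0])   # keeps the quotes around Buddy
print(parse_line(new_nick)[0])   # loses the quotes around Buddy

# Comparing on the key makes the two forms match again.
print(name_key(parse_line(old_nick)[0]) == name_key(parse_line(new_nick)[0]))
```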

4. They changed the header line.

It used to be:  

image

Now it’s:

image

Well, all they really changed was from using upper case to using mixed case for the field names. That might not seem like much, but DMT used the first line to check whether you have an FTDNA segment file. It would be easy enough to simply uppercase all the letters and compare those … but then, they did add a space in “Match Name” in the new file as well.
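One way to make a header check tolerant of these differences is to compare a normalized form of the first line: drop any BOM, uppercase everything, and remove all whitespace. The following is a minimal Python sketch of that idea. The header strings show only the first three columns and are assumptions based on the description above, and `header_key` is a hypothetical helper, not DMT’s code.

```python
def header_key(line):
    """Normalize a header line: drop a BOM, uppercase, remove all whitespace."""
    return "".join(line.lstrip("\ufeff").upper().split())

# Assumed header fragments (first three columns only).
old_header = "NAME,MATCHNAME,CHROMOSOME\r\n"
new_header = "\ufeffName,Match Name,Chromosome\n"

print(header_key(old_header) == header_key(new_header))
```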


The Effect of These Changes

The difficulty with writing programs that read in files produced by anyone else is that the file format can be changed at any time and break a developer’s utility program. For a desktop program like DMT, you then have to wait for its next release and hope that the programmer noticed the changes and the program has been updated to handle them. (p.s. Thank you to all of you who report problems to me. If I don’t know about them, I can’t fix them.)

With regards to handling file format changes, I envy those who write online programs, because they can squeeze in a fix online at any time. Changing a packaged program like DMT is a bit more involved.

All utility programs are subject to the whims of the programmers and web developers at FTDNA, GEDmatch, 23andMe, MyHeritage DNA and AncestryDNA. Any time they make a change, they affect the utility programs that use their data.

I’m still working on version 3.0 of Double Match Triangulator. Corrections for these FTDNA file format changes will be included, so that files in the new format should work alongside files in the old format.




Update: Nov 5, 2018:  Since I wrote this article 3 weeks ago, Family Tree DNA made one other tiny change to its Chromosome Browser Results file. They added a space after the comma but before the chromosome number on each line. This breaks Double Match Triangulator.
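For those parsing the file themselves, Python’s standard `csv` module has a `skipinitialspace` option that ignores spaces immediately after a delimiter, so rows with and without the extra space parse the same. This is a minimal sketch with hypothetical sample rows, not the fix used in DMT.

```python
import csv

def parse_row(line):
    """Parse one CSV line, ignoring spaces that directly follow a comma."""
    return next(csv.reader([line], skipinitialspace=True))

before = "John Smith,Mary Jones,1,100"    # no space before the chromosome
after = "John Smith,Mary Jones, 1,100"    # space added before the chromosome

print(parse_row(before) == parse_row(after))
```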


Update: Nov 28, 2018:  Here is a temporary fix to get the new format CBR files to work with DMT version 2.1.1. There are two possible methods:

1. If you have a text processor, open the CBR file with the text processor and change all comma spaces to just a comma, i.e. change all “, “ to “,”.  Then save the file.

2. If you don’t have a text processor, open your file with Excel. Select Column C (Chromosome), open the “Find and Replace” box and change space-X to X, i.e. change “ X” to “X”. Then save the file as CSV UTF-8 (Comma delimited) (*.csv). This will remove the extra spaces that FTDNA has added and will convert their file back to standard .csv format.

image   
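If you are comfortable running a script, the first method can also be done in a few lines of Python. This is a minimal sketch that assumes the file is UTF-8 (with or without a BOM) and that no name legitimately contains a comma followed by a space. The function names are hypothetical. It writes a corrected copy rather than changing the original.

```python
def strip_comma_spaces(text):
    """Change every comma-space to a plain comma."""
    return text.replace(", ", ",")

def fix_cbr_file(src, dst):
    """Write a copy of src to dst with comma-spaces changed to commas."""
    # newline="" keeps the file's own line endings untouched.
    with open(src, encoding="utf-8-sig", newline="") as f:
        text = f.read()
    # utf-8-sig writes a BOM at the start of the corrected copy.
    with open(dst, "w", encoding="utf-8-sig", newline="") as f:
        f.write(strip_comma_spaces(text))

print(strip_comma_spaces("John Smith,Mary Jones, X,100"))
```

Call `fix_cbr_file` with the name of your downloaded CBR file and the name you want for the corrected copy, then load the copy into DMT.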

Fixes to handle FTDNA’s new format have been developed and will be in version 3.0 of DMT when it is released.


Update: Dec 10, 2018:  Less than 2 months after Family Tree DNA changed their Chromosome Browser Results file and the method to download it, they’ve managed to change the method to download it again.

Now, on their Chromosome Browser page, you click on the “Download All Segments” link that’s at the top right of the DNA Matches listing. You still have to do this before you select matches to compare.

Then you wait a while. It took about 30 seconds for me before anything happened and my browser said the file to download was available.

image

This is actually a good change. Where they first had it at the bottom of the page was not easy to find. Now it is readily accessible.

Since it takes a while to respond, they really should pop up a box right away saying: “Please wait while the download is assembled”. But unfortunately they don’t, so most people will press it 100 times to get it to work, when really only one time will do.

Until Version 3.0 is released, the instructions shown in the Update Nov 28, 2018 must still be followed to put the new file into a format that DMT can read.




Update: Jan 8, 2019:  Version 2.9.5 of Double Match Triangulator has been released to handle the new Family Tree DNA format.
