
Louis Kessler's Behold Blog

The GEDmatch Relationship Tree tool - Wed, 24 Jan 2018

I hadn’t tried this one before. Have you?

image

It’s a Tier 1 tool, so you have to pay GEDmatch the $10 a month to use it.

What an imaginative, innovative idea. They’re using expected autosomal sharing and expected X sharing between two people to determine relationships, and they show a tree for the two people. When I run myself against one of my closest X matches on GEDmatch (28 cM shared X, 42 cM shared autosomal), someone I have no idea how I’m related to, and increase the overlap from 1 (which gave no results) to 2, I get this:

image

What it tells me is that, using the rules of how autosomal DNA and X-DNA are passed from a parent of a given sex to a child of a given sex, my shared amounts with this person most closely match the estimates for two relationships (as shown in the chart above). This person is most likely my 3rd cousin in one of these two ways:

  1. My mother’s father’s mother’s sister’s son’s daughter’s daughter.

  2. My mother’s father’s mother’s sister’s daughter’s son’s daughter.

This doesn’t preclude other ways that we could be related. I could increase the overlap to 3 and get more possibilities. But it does give some insight as to what the relationship might be.
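One of the rules in play can be sketched in a few lines. This is not GEDmatch's algorithm (which is barely documented), only the well-known constraint that a son gets his X from his mother, so X-DNA cannot travel down a line containing a father-to-son step. The function name `can_carry_x` and the way the lines are encoded are my own assumptions for illustration, and I have assumed a male tester at the end of my line.

```python
def can_carry_x(sexes):
    """Check whether X-DNA can be passed down a line of descent.

    `sexes` lists 'F' or 'M' for each person, starting with the child of
    the common ancestors and ending with the tested person. A father never
    passes his X to a son, so any M followed by M breaks the line.
    """
    return all(not (parent == "M" and child == "M")
               for parent, child in zip(sexes, sexes[1:]))


# My side: great-grandmother, grandfather, mother, me (assumed male).
my_line = ["F", "M", "F", "M"]

# Relationship 1: her sister, sister's son, his daughter, her daughter.
path_1 = ["F", "M", "F", "F"]

# Relationship 2: her sister, sister's daughter, her son, his daughter.
path_2 = ["F", "F", "M", "F"]

# A hypothetical line with a father-to-son step, which cannot carry X.
blocked = ["F", "M", "M", "F"]

for name, line in [("my line", my_line), ("path 1", path_1),
                   ("path 2", path_2), ("blocked", blocked)]:
    print(name, can_carry_x(line))
```

Both suggested relationships pass this check on each side, which is consistent with the tool offering them for a match that shares X-DNA.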

At GEDmatch they say this tool is experimental. It’s been around a few years and there is very little documentation about it. The best writeup I’ve seen is one by Israel Pickholtz in 2014.

How does it work?

Well that exact question was asked on my favorite Genealogy and Family History Question and Answer Site, and you can see my answer there.

Does the tool help? Well, I’m going to email that person and ask her who her mother’s father’s mother and her father’s mother’s mother were and where they were from, and see if they can connect to my mother’s father’s mother.

With regards to using relationship techniques like this in genealogy software like Behold or DNA analysis tools such as Double Match Triangulator, I have some ideas. Stay tuned.

—-

Update Jan 25: I contacted the person administering the account, an APG member who administers lots of DNA kits. He pointed out that I didn’t read the instructions correctly and entered the wrong values. The tool wanted the autosomal and X values from the one-to-one tool, and I had used the values from the one-to-many tool. He also said the person I matched to was likely a generation different from me. That doesn’t affect anything I say in the post about the tool; it only means the post is using example input.

So I entered the correct values into the Relationship Tree tool and it now gave a single result, not too much different than what is in the diagram above. At 3rd cousins once removed, the line we might be related to is at least one generation further than either of us have been able to genealogically research.

Nonetheless, try the tool. It may help you if both you and your match have genealogically researched enough generations back on the possible connecting lines.

Revisiting Missing A-B Matches - Thu, 18 Jan 2018

I’ve been working away the past couple of months to finish off Version 2 of Double Match Triangulator. There are lots of major improvements, and I’m hoping to find ways to get it to do some of the analysis for you as well.

What Double Match Triangulator does differently from other DNA matching tools is that it uses not just the segment matches that one person has, but the segment matches of two people together. Let’s call yourself Person A, and the one you’re matching with Person B. Double Match Triangulator compares the segment matches Person A has with other people to the segment matches Person B has with other people. Any that overlap are called a Double Match.

For example, Person A matches Person C on Chromosome 1 from position 72017 to position 7238701. Person B matches Person C on Chromosome 1 from position 3528942 to position 12148559. They Double Match between positions 3528942 and 7238701.
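The overlap in that example is just the intersection of two ranges on the same chromosome. Here is a minimal sketch of that one step; the `Segment` type and the `double_match` function are names I made up for illustration and are not taken from DMT's code.

```python
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    chromosome: str
    start: int  # start position
    end: int    # end position


def double_match(ac: Segment, bc: Segment) -> Optional[Segment]:
    """Return the part of the A-C segment that the B-C segment also covers.

    Returns None when the segments are on different chromosomes or do
    not overlap at all.
    """
    if ac.chromosome != bc.chromosome:
        return None
    start = max(ac.start, bc.start)
    end = min(ac.end, bc.end)
    if start >= end:
        return None
    return Segment(ac.chromosome, start, end)


a_c = Segment("1", 72017, 7238701)      # Person A matches Person C
b_c = Segment("1", 3528942, 12148559)   # Person B matches Person C
overlap = double_match(a_c, b_c)
print(overlap.start, overlap.end)  # 3528942 7238701
```

A real tool would presumably also apply a minimum segment size before counting the overlap, which this sketch leaves out.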

The magical thing about this Double Match is that we can now just look in Person A’s matches to see what segment matches he has with Person B (or we can look in Person B’s matches, since both will show the same matches with each other). If Person A matches Person B over some part of the Double Match segment with Person C, we have a triangulation: Person A matches Person C, Person B matches Person C, and Person A matches Person B, all over the same segment.

If Person A and Person B don’t match, we have what I call a Missing A-B match. We still have the Double Match of Person A with Person C and Person B with Person C.
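The two outcomes can be written as a small decision function. This is only a sketch of the logic as described here, assuming all segments are on the same chromosome and are given as `(start, end)` pairs; the names `overlap` and `classify` are hypothetical, and the example A-B segment is invented.

```python
def overlap(seg1, seg2):
    """Intersection of two (start, end) ranges, or None if they do not overlap."""
    start, end = max(seg1[0], seg2[0]), min(seg1[1], seg2[1])
    return (start, end) if start < end else None


def classify(ac, bc, ab_segments):
    """Classify one A-C / B-C pair given all A-B segments on that chromosome."""
    dm = overlap(ac, bc)
    if dm is None:
        return "no double match"
    # A triangulation needs A and B to match over part of the Double Match.
    for ab in ab_segments:
        if overlap(dm, ab) is not None:
            return "triangulation"
    return "missing A-B"


ac = (72017, 7238701)
bc = (3528942, 12148559)

print(classify(ac, bc, [(5000000, 9000000)]))    # triangulation
print(classify(ac, bc, [(20000000, 30000000)]))  # missing A-B
```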

You would think a Double Match that does not triangulate should just be considered a random match. I can match someone, and you can match the same person, but we don’t have to be related. I can be a cousin of Person C on their dad’s side and you can be a cousin on their mom’s side.

But when using Double Match Triangulator, we are choosing the B people to be people whom we, as Person A, match. With Person B being a DNA relative, the Double Matching of Double Match Triangulator is more powerful than the matches you see lining up on the same segment in a Chromosome Browser. In the Chromosome Browser, you have no indication of whether the people lining up with each other are related to each other or not. But in Double Match triangulation, you are picking the B people that are related to you. It is like using a Chromosome Browser where you are just including people that are guaranteed to be related to each other. That’s one of the things that makes Double Matching so useful. Of course the other thing is that Double Matching will find in one fell swoop EVERY TRIANGULATION that two people have between them.

The important thing about a triangulation is that all IBD (Identical By Descent) segments that are passed down from a common ancestor must triangulate. By looking only at triangulating segments, you are eliminating many segments that are false (guaranteed not to be IBD) and increasing your chance of finding IBD segments.

If we didn’t match to Person B, then none of the Double Matches would triangulate, and then we are no better off than using a Chromosome Browser and verifying the Person B-C matches one by one.

Back to the Missing A-B match

Unless you Double Match, you won’t find any Missing A-B matches. When looking in a Chromosome Browser as Person A, you’ll find Person B and Person C that overlap. You’ll then find out from either Person B or C if they match the other person on that segment and if they do, they triangulate. If they don’t, then you have a Missing B-C, but that is not a missing with you, Person A.

By Double Matching, all the B-C matches are verified in advance. If Person A also matches Person B, then you have a triangulation. If not, you have a Missing A-B match.

Missing A-B matches therefore are cases where Person A matches Person C and Person B matches Person C, but Person A does not match Person B on the same segment. The important caveat is that you also know that Person A and Person B are related: they match and triangulate on other segments, just not this one.

Missing A-B matches are not triangulations. Therefore they cannot be IBD. So what possible use then can they have?

It took me a while to figure this out. In fact I was going to take the display of Missing A-B matches out of Version 2 of DMT and I even took all the code out of the program that displayed them. But after doing some more work, I realized they were important, and put that all back in.

What got me thinking was the number of Missing A-B matches there were. There were often as many if not more Missing A-B matches than there were triangulations.

When I last puzzled over this, about six months ago in my Triangulation and Missing a-b Segments article, I came to the conclusion that Missing A-B matches could point to a common ancestor. But the way I had figured it, I thought that it required the parents of Person C to both be descended from the same common ancestor or ancestor pair in these two possible configurations:

But I never really thought that either of these two scenarios would be plentiful enough to give so many Missing A-B matches. Maybe in an endogamous population. But in my test runs even non-endogamous populations are loaded with Missing A-B matches. So what gives?

I finally figured out another case. Here are some results from a run of DMT using close relatives:

image

Here we have a daughter as Person A, double matched with her uncle as Person B, who is her mother’s brother. They match each other between positions 72017 and 7238701, shown on the Base AB line.

On the next line, the daughter of course will triangulate with her mother and her uncle because the daughter half-matches her mother everywhere, and her mother matches her uncle because it is the mother who passed down that segment to the daughter. Therefore they triangulate over that segment. That segment came from the common ancestor who would be the grandparent who passed that segment down to the uncle, the mother and the daughter.

On the third line, we pick the brother of the daughter as Person B. The daughter we already know matches the uncle on that segment. And her brother does as well. They Double Match. But the brother does not match his sister on that segment. What happened in this case is that mother passed down the other grandparent’s segment to the son. Mother still matches the son, but the son and daughter have different grandparent segments from their mother. The brother matches his uncle on the other grandparent’s segment. This is a Missing A-B match. The diagram looks like this:

image

This can happen at further relationships as well. For example at first cousins, we can have Cousin 1 as Person B who triangulates with the daughter and the uncle (Person C), and Cousin 2 who as Person B still double matches but doesn’t triangulate because he has a missing A-B match, as shown in the 4th and 5th lines of the spreadsheet above.

The diagram illustrating this cousin situation is:

image

Person A, the daughter, matches her uncle on H1 and her Cousin 1 on H1. Her brother and Cousin 2 both match her uncle on S1. The daughter matches Cousin 1 on H1 but does not match her brother or Cousin 2.
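To see why this falls out so naturally, here is a toy model of that one segment. Each person carries two copies, each tagged with a label for where it came from, and two people match if they share a label. H1 and S1 are the grandparent segments from the diagram; the labels P1, P2, Q1 and Q2 for each person's other copy are hypothetical, and I am assuming they are all different. Testing pairwise sharing like this is a simplification of real segment matching.

```python
# Which ancestral copies each person carries on this one segment.
people = {
    "Daughter": {"H1", "P1"},
    "Brother": {"S1", "P2"},
    "Uncle": {"H1", "S1"},   # Person C carries both grandparent segments
    "Cousin 1": {"H1", "Q1"},
    "Cousin 2": {"S1", "Q2"},
}


def matches(x, y):
    """Two people match on the segment if they share an ancestral copy."""
    return bool(people[x] & people[y])


def classify(a, b, c):
    """Label the A-B-C combination on this segment."""
    if not (matches(a, c) and matches(b, c)):
        return "no double match"
    return "triangulation" if matches(a, b) else "missing A-B"


results = {b: classify("Daughter", b, "Uncle")
           for b in ("Brother", "Cousin 1", "Cousin 2")}
for b, outcome in results.items():
    print(b, "->", outcome)
# Brother -> missing A-B
# Cousin 1 -> triangulation
# Cousin 2 -> missing A-B
```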

So we have a situation here where a Missing A-B match occurs that is not caused by having parents who are related and descended from a common ancestor, as I had surmised in my previous article. This new situation is likely much more common and is probably the reason why there are so many Missing A-B matches.

In fact, it can happen with further relationships than 1st cousins. Here’s an example with a 2nd cousin and 3rd cousin as Person B resulting in a Missing A-B match with the daughter as Person A and the uncle as Person C.

image

And just so you don’t think that Person B or Person C necessarily have to be a close relative of Person A, they can be close relatives of each other instead. For example below, the Son is Person A, the 3rd Cousin is Person B, and the Uncle of the 3rd Cousin is Person C:

image

This does require that whoever is Person C receives both of the common segments, one from each parent, so that it has one that matches Person A and one that matches Person B.

The bottom line is that a Missing A-B match is indicating that Person C who matches both Person A and Person B could have a common ancestor with Person A and Person B.  The common ancestor would have passed a segment down to Person C and also to either Person A or Person B.

What’s important to understand out of all this is that, although they are not IBD to Person A, Missing A-B matches may still indicate the possibility of a common ancestor with Person C, and they should not be ignored.

Chess and Artificial Intelligence: The Future Changed Today - Wed, 6 Dec 2017

The shocking and unbelievable news hit my inbox today, and I just couldn’t be more amazed by what has just happened and what the future effects will be.

Today it was announced that DeepMind, a company formed in 2010 and purchased by Google in 2014 to investigate the possibilities of using neural networks for machine learning, has created a program that given just the rules of a game, can play itself and learn and reach champion levels.

Everyone remembers Kasparov being defeated by the chess program Deep Blue in 1997. Go is a tougher game for computers. In March 2016, DeepMind created a program called AlphaGo, that defeated the world champion Lee Sedol.

But in just a year and a half since then, in October 2017, the algorithm was able to achieve superhuman performance in the game of Go using only 8 hours of training:

The next month, the algorithm achieved superhuman performance in the complex Japanese board game of shogi with 2 hours of training, and then it trounced Stockfish, one of the top computer chess programs in the world after just 4 hours of training.

AlphaZero played 100 games of chess with Stockfish and as white, scored 25 wins, 25 draws and no losses. As black, it scored 3 wins, 47 draws and no losses. In so doing, AlphaZero played classic human openings with no opening knowledge programmed into it. It had no endgame database. It had no heuristics. It learned everything itself.

Long, long ago, when I was a student at the University of Manitoba, I had a hobby I had dabbled in: programming a computer to play chess. I had reached a point where my program, Brute Force, was then one of the best in the world. I went to Seattle, Washington in 1977 for the 8th North American Computer Chess Championship, and followed that up in 1978 in Washington, D.C. for the 9th NACCC. (If you’re interested, see my writeup on my chess program, Brute Force).  

The program was called Brute Force because I concentrated on doing the minimum possible to evaluate positions, and simply let the program iterate as many moves as possible to determine the best move. I had the full use of the University of Manitoba’s IBM 370/168 mainframe, which likely was as powerful then as your smartphone is today. Smartphones today can play better chess than the big computers did back then in the Computer Chess Championships of the ‘70s.

Soon after, my programming interests switched to genealogy, but I remained interested in and followed the advances in computer chess and artificial intelligence in general. The best computer chess engines today, Houdini, Komodo and Stockfish, play at an Elo rating of about 3400. The best chess player in the world today is Magnus Carlsen of Norway, rated at about 2840. The chance of Magnus defeating one of these programs today is the same as a person rated 2280, say the 6000th best chess player in the world, beating Magnus.
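The comparison works because both gaps are 560 rating points. As a rough check, here is the standard Elo expected-score formula applied to the ratings quoted above. Note that the expected score counts a draw as half a point, so it is not strictly the chance of a win; the function name is my own.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expected score for player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


carlsen_vs_engine = expected_score(2840, 3400)
player_vs_carlsen = expected_score(2280, 2840)

# Both gaps are 560 points, so the two expected scores are the same.
print(round(carlsen_vs_engine, 3), round(player_vs_carlsen, 3))  # 0.038 0.038
```

So under the Elo model the weaker side in each pairing would be expected to score a little under 4 points per 100 games.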

Stockfish and these other programs have been worked on continuously for a long time. They encompass 20 years of improvements in computer speed, better chess-specific algorithms, and inclusion and refinement of chess ideas since Deep Blue of 1997.

Yet, in only four hours of training, this program AlphaZero, defeated Stockfish handily.

But that’s not what is most amazing.

What is most amazing is that AlphaZero trained using a general game-playing learning algorithm that had zero chess-specific knowledge, other than the rules of how the pieces could move and what was a win, loss and draw.

See this Chess24 article by Colin McGourty for the chess viewpoint on this, along with a link to the paper written yesterday (Dec 5) on this breakthrough:

 

What Has Just Happened?

DeepMind has developed a general method where a machine can learn how to gain world-level expertise in any task once the rules are set out.

This will change everything. Think more than just board games. Think bigger.

Not long ago, I saw this tweet with a video of a robot:

Imagine now what would happen if we could program the “rules” of walking, and gravity and motion into it, and throw it through this learning algorithm. You would likely end up with a robot that could run, jump, swim, whatever at hundreds of miles per hour without disturbing anything in its path. (Notice the text of the tweet: “we dead”)

Let’s go further: Computer voice recognition. It’s terrible right now. It needs training and still makes lots of mistakes. This algorithm will make mincemeat of that problem.

Language translation: So easy now.

Computer vision, hearing, tasting, smelling. Now doable.

Artificial intelligence … Now it’s very scary. We had better do this right.

That’s enough. Time to breathe again and go back to genealogy programming.