
Louis Kessler's Behold Blog

DNA or Bust - Sat, 7 May 2016

It was Judy Russell @legalgen on our recent Unlock the Past Genealogy cruise who implored me to get my 93-year-old uncle DNA tested while I still can. Judy made me realize that it was necessary.

I have always understood the importance of DNA research as a valuable science that works hand-in-hand with traditional research to help discover relationships. For the past few years, I have been learning everything I can about DNA. I’ve become a member of ISOGG and keep up with the latest advancements by following a good number of genetic genealogists through their blogs and Twitter accounts. Whenever I have questions, I also go to my daughter, who finished her university degree in genetics and microbiology and can set me straight.

Over the past 20 years, doing my own genealogy has taken a back seat to my day job, my family, and my development of Behold. But I realized Judy was right. Similar to talking to your oldest relatives about everything they know, this was different from the rest of my genealogy: it was time-urgent. My uncle is the last of my living uncles and aunts on either my father’s or mother’s side. This was my last chance to test a directly connected generation further back, which would help me in my future research to isolate my father’s side from my mother’s side once I get around to doing a DNA test of myself and starting to ask other relatives if they would do so.

I was curious as well. My father’s (and my uncle’s) parents both originate from a small region in what is now northeastern Romania. Unless something unexpected happened somewhere along the way, their parentage should be 100% Ashkenazi and that would likely bring into play all the DNA puzzles associated with endogamy.

So I met with my uncle, got his permission, and ordered the Family Finder, Y37 and mtDNA+ tests from FamilyTreeDNA. We waited a few weeks for the kit to come. My uncle swabbed his cheeks and we sent back the kit for analysis.

Yesterday I got an email from FamilyTreeDNA that my uncle’s autosomal results were available. How many cousins would it find?  I was hoping for at least a few. To be honest, I didn’t expect a lot. How many did it find? Well, would you believe 7,017?

Of those, 126 were suggested as 2nd cousins, 226 as 3rd cousins, 879 as 4th cousins, and the remaining 5,900 as having significant but remote relationships further than 4th cousins. For those who understand what the following means, the average match was 77.6 cM (centimorgans) with a maximum of 168.9 and a minimum of 20.0. The longest matches averaged 10.5 cM (maximum 32.7, minimum 6.9). I do understand that ancestor collapse occurs in endogamous populations and the number of cM for a match may indicate a relationship closer than the true relationship. FamilyTreeDNA does state that they adjust for this: “Beginning on April 21, 2011, we have modified our Family Finder matching algorithm to address this. The changes affect the match list for Ashkenazi Jews. The outcome is calculated Family Finder relationships that more accurately reflect relationships to other Ashkenazi Jews.” What they don’t state anywhere is whether or not they’ve applied this rule to my uncle’s results.

So let’s see what they consider my uncle to be via FamilyTreeDNA’s “my Origins” page. Will there be any surprises?


Sort of what was expected. A bit of Middle Eastern can originate from the deep roots of the Ashkenazi people, who were in the Middle East 2000 years ago. I could see the 2% European coming from the base people FamilyTreeDNA used to define its ethnic makeup, some of whom had Ashkenazi ancestry but didn’t realize it.

So the next task was to see how many of those 7,017 matches I recognize as people who I know are related on my father’s side.

My first surprise was that the 3rd person listed was a fellow researcher on my father’s mother’s father’s side. I have communicated with him many times over the past several years and we have shared much information about our common line and determined, but not proved, that we are cousins. He shows up as a suggested 2nd cousin of my uncle, and we believe that to be true.

My second surprise was that none of the other 7,016 matches, even the others listed as 2nd cousins, were known to be related to my uncle (or me) on my father’s side.

But there were other people among the matches that very much surprised me:

1. Listed 142nd as a 3rd cousin is Brooke Schreier Ganz, a fellow genealogy software developer (of LeafSeek) who I met at RootsTech 2014. There were another 5 people in my matches that were submitted with Brooke’s email address. We must be related, but I don’t yet know how.

2.  Listed 537th as a 4th cousin is someone who I’ve shared a lot of information with about our common families. The trouble is that the shared information is about my wife’s family, not my father’s family. I have no idea how he might be related to me.

3. In 1,495th spot is Gary Mokotoff, the publisher of Avotaynu, the International Review of Jewish Genealogy. He is a big name in Jewish Genealogy and knows a few things.

4. In 2,003rd spot was Israel Pickholtz. He is one of the speakers I had planned to see at the Ontario Genealogical Society conference on June 5th. I’m halfway through his book: Endogamy: One Family, One People. At the back of his book, he has 3 pages (2 shown below) of the people whose DNA was used for his projects. I went through his list and found almost half of them (marked in red below) are in my list of matches, and some of them were listed as high as 2nd and 3rd cousins, so we must somehow connect. Now I’ll have a lot more to talk to Israel about when I see him in Toronto.


5. Finally, last but not least, in 2,364th spot is Lara Diamond, two of whose talks I was planning to attend at OGS on June 5. Interestingly, Lara recently blogged about finding a connection to Israel’s family. I went to Lara’s blog post on How Endogamy Looks in Practice. Almost all the people Lara lists on that page are in my matches, and 6 of them are shown as 3rd or 4th cousins. Lara and I will also have lots to talk about.

The FamilyTreeDNA listings include ancestral surnames and places as well as ancestral trees for the people who have entered them. Other than the one relative I know of, only about a dozen of the 7,000 matches listed any of the ancestral names and places that I have already researched on my father’s side, so at the moment, I don’t know how I’m related to any of these people.

I will contact the people I mentioned above regarding our connections, and I’m going to see if I can figure out ways to make chromosome matching between 7,000 people a bit easier. This will be a very interesting adventure as I start to sort all this out. This insight could also allow me to develop some more ideas for DNA tools that I can add to Behold. I’ll keep you posted.

As CeCe Moore said in a RootsTech 2015 interview: “People who might not think they’re going to find anything will be very surprised.”

OGS Conference, Sunday June 5, Toronto, Ontario - Sat, 7 May 2016

@OGSConference. I had noticed that the Ontario Genealogical Society was holding their annual Conference in June. The Sunday program had several speakers I really wanted to hear. Lara Diamond will be talking on “Movement Between Towns in Eastern Europe” and “Jewish Genealogical Research in Ukraine”, and Israel Pickholtz will be talking about the successes of DNA research in his endogamous family. I’m currently halfway through reading his new book: Endogamy: One Family, One People. To top that off, Sunday has a panel discussing “The Future of Genetic Genealogy” with panel members: Elizabeth A. R. Kaegi (moderator), Maurice Gleeson, CeCe Moore, David Pike and Judy Russell. The closing keynote for the day is CeCe Moore with “Lessons from the Cutting Edge” of genetic genealogy.


I didn’t have time in June to attend the full Conference. But with a lineup like that on the Sunday, I just couldn’t pass up that one day of the Conference. And I knew it would be great to meet up with Judy again, who I had spent a few weeks with a few months ago on the 10th Unlock the Past Genealogy cruise. I booked a flight to Toronto on Saturday June 4, reserved the Conference hotel for the one night, registered for the one day of the Conference, and booked my flight home to Winnipeg on Sunday night.

The Chair of the Conference, Paul Jones, noticed that I had registered and recognized my name as the author of Behold. He asked if I’d be interested in becoming an exhibitor or giving a pop-up presentation or becoming a sponsor. Due to my tight schedule for the day, becoming an exhibitor or giving a presentation did not work well, but I did like the idea of becoming a sponsor. So I’m proud and excited to announce that:

Behold Genealogy will be sponsoring
the panel discussion: Session 39
“The Future of Genetic Genealogy”
on Sunday June 5, 1:15 pm
at the OGS Conference, Toronto, Ontario

So that’s now a day I’m really looking forward to. It will be great talking to Behold and GenSoftReviews users and meeting for the first time many people I know and who know me only from the web.

Unexpected BNF - Mon, 2 May 2016

Just before I released the last version of Behold a month ago, a user asked if there was a way to display all the burials and nothing else. They wanted to check out all the cemeteries in a city they were going to.

I thought Behold could do that already. In the Organize Tags page, I allow selection of which tags you want to display. I thought it should be possible because you can select the BURI tag and unselect everything else, like this:


But it didn’t quite work right. First, subordinate tags to the BURI tag would be hidden, so you’d have to find them and select them again. Doing so would sometimes make some unwanted data show up elsewhere.

Also, I found the tag hiding was not working in the Place details and the Source details. And that’s where it was needed most. Had it worked, it would have produced very useful information in Behold’s Everything Report that would have looked like the following screen shots.

First, the person info would still show everybody, but only their burial facts:


Then for places, only those places having burials would display:


Then the sources would only show those that were claimed for burials:


Having this feature would allow not just a customized listing of burials, but would allow lists of any desired fact or combination of facts. For example, maybe you had access to some school service or military service records and you want a concise listing of those to refer to.

So I decided to include the ability to hide or display facts as part of the (hopefully) final set of changes to the Everything Report prior to implementing saving and editing.

When I started implementing this a couple of weeks ago, I immediately realized that specification of the tags to include and exclude did not work at the tag level. I needed to make that specification work at the fact level. That way, each fact type (e.g. birth, education, marriage, death, burial, etc.) could be shown or hidden as desired, and the same can be done for those fact types in the Place Details and Source Details. Then I’d add a checkbox to deselect (or select) all facts so that it would be easy to select the 1 or 2 facts that you wanted shown (or hidden).
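To make the fact-level idea concrete, here is a minimal sketch in Python. It is an illustration only and not Behold's Delphi code: the `FactFilter` class, its method names, the fact-type names and the sample data are all made up for this example. It shows a "deselect all" step followed by turning on the one fact type that is wanted.

```python
from dataclasses import dataclass, field

# Hypothetical fact types; Behold's real list comes from GEDCOM.
ALL_FACTS = ["birth", "education", "marriage", "death", "burial"]


@dataclass
class FactFilter:
    """Tracks which fact types are shown. Everything is shown by default."""
    shown: set = field(default_factory=lambda: set(ALL_FACTS))

    def select_all(self):
        self.shown = set(ALL_FACTS)

    def deselect_all(self):
        self.shown = set()

    def show(self, fact_type):
        self.shown.add(fact_type)

    def hide(self, fact_type):
        self.shown.discard(fact_type)

    def apply(self, facts):
        """Keep only the (fact_type, detail) pairs whose type is shown."""
        return [f for f in facts if f[0] in self.shown]


# Made-up sample facts for one person.
person_facts = [
    ("birth", "1 Jan 1900, Example Town"),
    ("marriage", "2 Feb 1925, Example Town"),
    ("burial", "5 Mar 1980, Example Cemetery"),
]

# The "burials and nothing else" case: clear everything, then pick one.
burials_only = FactFilter()
burials_only.deselect_all()
burials_only.show("burial")
visible = burials_only.apply(person_facts)
```

Because the filter works on fact types rather than raw tags, the same `apply` call could in principle be reused for the facts listed under each place and each source.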

I got that working and it didn’t do too badly. It produced the above screen shots when burials were selected.

I was able to do this fairly simply by extending my tag definitions in my Delphi code. My previous code that only allowed the selection of tags looked like this:


This code set up each tag, the text that would display, whether this tag was hidden by default or not, and which versions of GEDCOM this tag was valid in.
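As a rough illustration only, and not the actual Delphi code, a tag table holding those four pieces of information could be sketched in Python like this. The `TagDef` type, the `set_default_tag` and `visible_tags` functions, the display texts and the version lists are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagDef:
    tag: str              # GEDCOM tag, e.g. "BURI"
    display: str          # text shown in the report
    hidden: bool          # hidden by default?
    versions: frozenset   # GEDCOM versions in which the tag is valid


TAG_DEFS = {}


def set_default_tag(tag, display, hidden, versions):
    """Register one tag definition (hypothetical stand-in for the Delphi routine)."""
    TAG_DEFS[tag] = TagDef(tag, display, hidden, frozenset(versions))


# Illustrative entries; the version lists are examples, not a full mapping.
set_default_tag("BURI", "Burial", False, ["5.5", "5.5.1"])
set_default_tag("CHAN", "Last changed", True, ["5.5", "5.5.1"])
set_default_tag("EMAIL", "Email", False, ["5.5.1"])


def visible_tags(version):
    """Tags that are valid in the given GEDCOM version and not hidden by default."""
    return sorted(
        d.tag for d in TAG_DEFS.values()
        if version in d.versions and not d.hidden
    )
```

Keeping the definitions in one table means the display logic never needs tag-specific branches; it only looks up the entry for whatever tag it meets.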

Some of the tags, especially in the HEAD record were shown with the record name and the tag name using a period to separate them, e.g.:


Now I’d be separating out the facts from these tags so that only the fact tags could be selected. I decided I could identify these if I prefixed them with the record tag they came from, which would only be INDI (individual) or FAM (family) records.  So now the facts section of this code started to look like this:


The tricky part of all this was picking out which of the tags were facts one level under the INDI and FAM record. There are some tags, such as SOUR and NOTE and OBJE that can be both at level 1 describing the person, but also can be at level 2 describing a fact. There were also some odd things, such as CHAN, the record change date tag, the ID number, and additional information Behold supplies at the fact level that needed to be handled so that only the desired facts would display and nothing else.
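The level problem can be sketched in a few lines of Python. This is a simplified illustration, not Behold's code: the `classify` function, the partial `FACT_TAGS` set and the sample record are assumptions. The point is that the same tag (SOUR here) is labelled differently depending on whether it sits directly under the record or under a fact.

```python
# Partial, illustrative set of level-1 tags that count as facts.
FACT_TAGS = {"BIRT", "DEAT", "BURI", "EDUC"}


def classify(lines):
    """Label each level-1 and level-2 line as a fact, a fact detail,
    or a record detail, qualified by the record tag it came from."""
    record = None   # tag of the current level-0 record, e.g. "INDI"
    parent = None   # current level-1 fact tag, or None if not inside a fact
    result = []
    for line in lines:
        parts = line.split(" ", 2)
        level = int(parts[0])
        if level == 0:
            # "0 @I1@ INDI" has an xref before the tag; "0 HEAD" does not.
            record = parts[2] if parts[1].startswith("@") else parts[1]
            parent = None
        elif level == 1:
            tag = parts[1]
            if tag in FACT_TAGS:
                parent = tag
                result.append(("fact", f"{record}.{tag}"))
            else:
                parent = None
                result.append(("record detail", f"{record}.{tag}"))
        elif level == 2 and parent is not None:
            result.append(("fact detail", f"{record}.{parent}.{parts[1]}"))
    return result


# Made-up sample record: SOUR appears both under BURI and under the person.
sample = [
    "0 @I1@ INDI",
    "1 NAME John /Doe/",
    "1 BURI",
    "2 PLAC Example Cemetery",
    "2 SOUR @S1@",
    "1 SOUR @S2@",
]
labels = classify(sample)
```

Here `INDI.BURI.SOUR` would be kept when only burials are displayed, while the person-level `INDI.SOUR` would be hidden.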

So to rigorously ensure I had all the facts, I had to go back to the GEDCOM 5.5.1 standard and work my way through it and pick out all the fact tags, and all the detail tags and the ones that could be used as either a fact or a detail.

After a couple of successively improving attempts at this … Eureka!  I realized something. I had in place the structure I needed to do an automated and effectively perfect parsing of a GEDCOM file. The GEDCOM standard is constructed using a grammar known as Backus-Naur Form or BNF. It defines what constructs are allowed. An excerpt of it from GEDCOM looks like this:


The items in double angle brackets are subordinate structures, e.g.:


and the items in single angle brackets are the data tags, e.g.:


Unexpectedly, I had figured out how to simply enter the GEDCOM BNF notation into my code, and Behold will do the checking to ensure that the input conforms to the GEDCOM standard.  My code will now look like:



What this does is allow me to just about take the GEDCOM BNF and copy it directly into my own code. There is no translation or mapping I need to do so it is relatively painless and less error-prone.
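To show why the notation is so easy to carry over, here is a small Python sketch that reads lines written in the style of the GEDCOM 5.5.1 INDIVIDUAL_RECORD definition. It is an assumption-laden illustration, not Behold's implementation: the `BNF_LINE` pattern and `parse_bnf_line` function are hypothetical, and the pattern only handles the simple line shapes shown, not alternatives such as `[Y|<NULL>]`.

```python
import re

# level, optional @XREF@, then either <<STRUCTURE>> or TAG [<VALUE>], then {min:max}
BNF_LINE = re.compile(
    r"^(?P<level>n|\+\d+)\s+"
    r"(?:(?P<xref>@[^@]+@)\s+)?"
    r"(?:<<(?P<structure>\w+)>>|(?P<tag>\w+)(?:\s+<(?P<value>\w+)>)?)\s+"
    r"\{(?P<min>\d+):(?P<max>\d+|M)\}$"
)


def parse_bnf_line(line):
    """Split one BNF line into its parts; a max of 'M' (many) becomes None."""
    m = BNF_LINE.match(line.strip())
    if m is None:
        raise ValueError(f"not a recognised BNF line: {line!r}")
    return {
        "level": 0 if m["level"] == "n" else int(m["level"]),
        "xref": m["xref"],
        "structure": m["structure"],   # name inside << >>, if any
        "tag": m["tag"],               # plain tag, if any
        "value": m["value"],           # name inside < >, if any
        "min": int(m["min"]),
        "max": None if m["max"] == "M" else int(m["max"]),
    }


excerpt = [
    "n @XREF:INDI@ INDI {1:1}",
    "+1 <<PERSONAL_NAME_STRUCTURE>> {0:M}",
    "+1 SEX <SEX_VALUE> {0:1}",
]
rules = [parse_bnf_line(line) for line in excerpt]
```

Each parsed rule already carries the structure name and the occurrence limits, which is the information a checker needs.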

I will add extra parameter calls to my SetDefaultTag routine for the minimum and maximum number of occurrences of each construct and the minimum and maximum size of the data values, which will allow my routine (to be renamed to LoadBNF) to automatically check those limits and issue a message if the GEDCOM is not proper. The really nice thing is that I’ll have the actual GEDCOM structure name coded, so for any error messages Behold will be able to display the exact structure name the error is in, e.g.:

<<NAME_PERSONAL>> more than 120 characters
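A check of that kind might look roughly like the following Python sketch. It is not the planned LoadBNF routine: the `Rule` type, the `check` function, the sample limits and the wording of the occurrence message are assumptions; only the size message follows the format shown above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    name: str                   # structure name used in messages
    tag: str                    # GEDCOM tag the rule applies to
    min_occurs: int
    max_occurs: Optional[int]   # None means no upper limit
    max_size: Optional[int]     # None means no size check


# Illustrative limits only.
RULES = [
    Rule("NAME_PERSONAL", "NAME", 0, None, 120),
    Rule("SEX_VALUE", "SEX", 0, 1, 7),
]


def check(children, rules=RULES):
    """Check (tag, value) pairs found under one record against the rules."""
    messages = []
    counts = Counter(tag for tag, _ in children)
    for rule in rules:
        n = counts[rule.tag]
        if n < rule.min_occurs:
            messages.append(
                f"<<{rule.name}>> occurs {n} times, fewer than {rule.min_occurs}")
        if rule.max_occurs is not None and n > rule.max_occurs:
            messages.append(
                f"<<{rule.name}>> occurs {n} times, more than {rule.max_occurs}")
        if rule.max_size is not None:
            for tag, value in children:
                if tag == rule.tag and len(value) > rule.max_size:
                    messages.append(
                        f"<<{rule.name}>> more than {rule.max_size} characters")
    return messages


# Made-up input: an over-long name and a repeated SEX line.
problems = check([("NAME", "x" * 121), ("SEX", "M"), ("SEX", "F")])
```

Since every rule carries its own structure name, the messages can name the offending structure without any extra lookup.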

I was planning to implement complete GEDCOM checking a few months from now once I started working on GEDCOM 5.5.1 output. I had looked previously at Delphi implementations of BNF, and was not looking forward to the task of either adapting them or writing my own. Believe me, writing a grammar parser is not fun. That’s likely because it’s something they make you do in 2nd year Computer Science classes as a lab assignment. So I’m very pleased. I never expected the methodology to do this would just fall into place so conveniently.

The beauty of this coding structure is that I’ll be able to go through previous GEDCOM versions (5.5, 5.3, 5.0, 5.5EL, FTM Text) quite quickly and incorporate full checks of those structures as well.

This will also extend to similarly structured grammars that are also made up of hierarchies of tags and values. This includes JSON Schema, which means I can use this for reading Behold’s own file format when I develop that along with editing. I’d likely also be able to quickly develop the input routines for FamilySearch’s GEDCOM X when the time comes to do so.

This is what programming is about and what makes it so much fun. You build up a structure and methodology in small steps and it evolves into something that you never expected and amazes you.

If you got to this point in this post, thanks for lasting through all this technical jabber. I had to spout this off. Now I feel better.

I’m working hard. Lots of great things to come.