
Louis Kessler’s Behold Blog

The Best Genealogy Software Reviews - Sun, 14 Dec 2008

There is a recently redone site by Tamura Jones called Modern Software Experience that has posted the most comprehensive and accurate reviews of genealogy software I have ever seen. He doesn’t pull any punches and tells it like he sees it, which is something this field of programming really needs.

So far, he has reviewed Family Tree Maker 2008, 2008 SP3 and 2009, Family Historian, Long Family History, It’s Our Tree, FamilyTreeFactory, MyBlood, Legacy, Heredis, … the list goes on. The reviews are detailed and technical.

He runs each program through two sample GEDCOMs: a 1 MB GEDCOM with close to 5,000 individuals, which is a typical large family tree, and a GEDCOM of 100,000 individuals that acts as a real torture test, but is something all programs should handle. You have to read the comparisons, because they are a real eye-opener. They give me some real goals to attain with Behold.

In addition, the detailed reviews point out many other things that programs do badly, things I know I have to avoid doing in Behold. So I just love the resource that Tamura has put up.

Tamura has personally conversed with me and critiqued Behold quite accurately. And I’m so glad he did, because many of his comments and suggestions have led to major improvements, including my effort right now to optimize the speed of Behold and reduce its memory usage - the program’s two current Achilles heels.

He’s got a lot more than just genealogy reviews up there. He comments on Microsoft technology, standards, and web browsers. And he has really interesting analyses of current events in genealogy software, such as “The Family Tree Maker Book Building Promise”.

So take a few hours, or a few days (there’s that much there) and peruse the material at Tamura’s site.

Note he’s using pure web standards for his pages, and they are in XHTML format. Internet Explorer won’t work there unless you follow his instructions, or use a standards-compliant browser such as Firefox or Google Chrome.

It’s All the Rage - Tue, 2 Dec 2008

I’ve been at a conference for a couple of days. It’s on for a few more. But I’m doing it from the comfort of my computer at home.

Embarcadero, the new owners of Delphi, are hosting the 3rd CodeRage conference from Dec 1 to 5. It is just like any other conference, with one difference: it is online.

This is the first time I’ve done one of these. It’s got keynote presenters, two meeting rooms, a virtual exhibit hall, and they even give you breaks between presentations and for lunch, just like a real conference. The presentations are pre-recorded in PowerPoint style, but the presenters are available live after each presentation to answer questions in a chat room.

People all over the world who use Embarcadero tools, which includes Delphi, are meeting online for virtual presentations that are of interest to them. I’m personally interested in some of the Delphi presentations.

On Monday, I listened to and watched two of Marco Cantù’s presentations on Delphi 2009 - Unicode and What’s New. Following that I posted a couple of questions to him, and I was a bit surprised that he did not know what the ANSEL character encoding was, as I was asking whether Delphi 2009’s TEncoding class handles it.
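
For anyone wondering why that matters: ANSEL is the legacy character set of the GEDCOM standard, and it has no Windows code page, so TEncoding can’t simply be asked for it by number. Here’s a rough sketch of what I was getting at - my own illustration, not code from the presentation, and AnselToUnicode is a hypothetical routine a program would have to supply itself:

    uses SysUtils;

    // Hypothetical helper: ANSEL is not a Windows code page, so a real
    // implementation needs its own ANSEL-to-Unicode mapping table (including
    // the combining diacritics that precede their base letters).
    function AnselToUnicode(const Raw: TBytes): string;
    begin
      Result := '';  // stub for illustration only
    end;

    function DecodeGedcomLine(const Raw: TBytes; const CharSet: string): string;
    begin
      if SameText(CharSet, 'UTF-8') then
        Result := TEncoding.UTF8.GetString(Raw)      // built into Delphi 2009
      else if SameText(CharSet, 'ANSEL') then
        Result := AnselToUnicode(Raw)                // no TEncoding for this one
      else
        Result := TEncoding.Default.GetString(Raw);  // system ANSI code page
    end;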

I also purchased his new Delphi 2009 book, which he just made available at the start of the conference. I was one of the first 60, so I got $8 off the price, but I was very displeased to see that Lulu charges an exorbitant $16.43 for ground shipping after first suggesting I pay $74.26 for standard shipping to Canada. Maybe I should pay their $134.60 for their “Super Mega Fast” shipping, which will arrive the very next day … after you wait 3 to 5 days until the book is printed. Hmmm. $16.43 for 8 to 12 days or $134.60 for 4 to 5 days. That’s some sort of scam Lulu’s got going there. Give me the book and I’ll hand deliver it for $134.60!

But I’m still looking forward to a few more presentations before the conference ends Friday, watching it on one monitor, while working on Behold on the other. Sort of wish I had a third monitor now.

What the Hash - Thu, 27 Nov 2008

With the optimization of the “smaller” 95,000-person GEDCOM resulting in a load time of less than one second, I didn’t think that simply counting the number of tags by tag type should add 2 additional seconds to the time.

There are over 1.4 million tags to count into 100 different tag types. I also count separately whether they are record tags, link tags, or data tags. I’ve been using the b-tree structure that is one of the oldest things in Behold, around since the beginning about 10 years ago. It has withstood the test of time and I thought it was pretty efficient … until now.

So it was time to finally replace it with a hash table implementation. No matter how good my b-tree is, it still takes on the order of log(n) for each operation, where n is the number of items in the tree - for a few hundred entries, that’s roughly eight comparisons per lookup. But a hash table takes O(1), i.e. it doesn’t matter how big the table is, it always takes about the same amount of time.

What a hash function does is take a string, e.g. INDI, and move bits around (as efficiently as possible, since you’ll do this zillions of times) to create a fairly random number from it. Then you whittle that number down to a number from 1 to the size of your hash table, and that is the location in the table where the information relating to the INDI tag should be stored. If two of your tags “collide” at the same location, there are various techniques for handling it. The bottom line is that as long as your hash table has enough room to easily store all your entries with few collisions, the hash table implementation will be very efficient.
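
To give a rough idea, a simple string hash in Delphi might look something like the sketch below. This is a generic illustration of the technique, not Behold’s actual code:

    {$OVERFLOWCHECKS OFF}  // let the 32-bit hash value wrap around harmlessly

    // Simplified example: mix the characters of a tag name into a number,
    // then fold it down to a slot index in a table of TableSize entries.
    function SimpleTagHash(const Tag: string; TableSize: Cardinal): Cardinal;
    var
      i: Integer;
      h: Cardinal;
    begin
      h := 5381;                                   // arbitrary starting value
      for i := 1 to Length(Tag) do
        h := (h * 33) xor Cardinal(Ord(Tag[i]));   // shuffle each character into the hash
      Result := h mod TableSize;                   // slot index, 0 .. TableSize-1
    end;

For example, SimpleTagHash('INDI', 256) picks the slot where the INDI entry would live; if another tag lands on the same slot, that gets handled by chaining or by probing for the next free slot.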

There were several hash table implementations available for Delphi, but I decided on Primoz Gabrijelcic’s GpStringHash. I like his work, and he’s obviously put some thought into making it efficient, as the hash function is coded in inline assembler. He’s also the author of GpProfile, which I’ve used for years, and he’s been quite active on Stack Overflow.

The bottom line is that his hash function reduced the 1.4 million lookups from 1.4 seconds down to .35 seconds. That makes the hash 4 times faster than the b-tree, and this is for a small table of only a few hundred entries! I was surprised that the improvement was that much.

I also found a way to eliminate all but .05 of the .60 seconds of overhead in the increment procedure. The bottom line is an 80% reduction, from 2.0 seconds down to .40 seconds. I still think .40 seconds is too much time just to count 1.4 million items, but it will have to do for now.
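
To picture what the counting step amounts to, here’s a rough sketch using Delphi 2009’s new hash-based TDictionary as a stand-in - just my own illustration, not Behold’s code and not the GpStringHash API:

    uses Generics.Collections;

    // Count how many times each tag appears, using a hash-based dictionary.
    procedure CountTags(const Tags: array of string);
    var
      Counts: TDictionary<string, Integer>;
      i, N: Integer;
    begin
      Counts := TDictionary<string, Integer>.Create;
      try
        for i := Low(Tags) to High(Tags) do
          if Counts.TryGetValue(Tags[i], N) then
            Counts[Tags[i]] := N + 1   // tag seen before: bump its count
          else
            Counts.Add(Tags[i], 1);    // first time this tag has appeared
        // ... the counts would be reported or stored here ...
      finally
        Counts.Free;
      end;
    end;

Every increment is one hash lookup plus one store, which is why shaving a fraction of a microsecond off each one matters when it happens 1.4 million times.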

I will be using this hash function instead of my old b-tree for most of my other internal data structures as well, which should result in additional speed improvements.