
Louis Kessler’s Behold Blog

Using AI Bots for Programming - Sun, 16 Mar 2025

I’m working hard on Version 2 of Behold, and I needed to write some code to strip out HTML from a string. This would be so Behold could display the text embedded in the HTML correctly without formatting.

Since I’m a Windows and Office 365 aficionado, I get Microsoft’s AI bot, Copilot, built into my Edge browser and my Office products. So that’s the one most convenient for me to use. I have had good luck in the past asking Copilot programming questions to help me with small tasks in Delphi, my programming language.

But this time, it was a task I really didn’t want to do myself. I saw the solution as scanning through the string character by character and parsing out what was HTML to remove it. Not too difficult, but a bit of a pain to code. So I thought I’d let Copilot try.


Query Number 1


uses
  System.RegularExpressions;

function RemoveHTML(const Input: string): string;
var
  RegEx: TRegEx;
begin
  // Create a regular expression to match HTML tags
  RegEx := TRegEx.Create('<[^>]+>');
  // Replace all HTML tags with an empty string
  Result := RegEx.Replace(Input, '');
end;

// Example usage:
var
  Original, Cleaned: string;
begin
  Original := '<html><body>Hello, <b>World</b>!</body></html>';
  Cleaned := RemoveHTML(Original);
  WriteLn(Cleaned);  // Output will be "Hello, World!"
end.


Well, that was a surprise to me: a very simple two-line function. It uses a mathematical notation called a “regular expression” to find the HTML bits. In all my programming days, I’ve never come to understand them well. I find that once you write one of those regular expressions, such as the “<[^>]+>” that Copilot created, it is almost impossible to reverse engineer what it does and how it works.
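For anyone who shares that difficulty, the pattern gets easier to read when it is taken one piece at a time. Here is a minimal sketch of that, written in Python rather than Delphi purely so it can be tried anywhere; this particular pattern means the same thing in both regex engines. The names TAG_PATTERN and strip_tags are made up for illustration and are not part of Copilot’s answer.

```python
import re

# The same pattern Copilot produced, written in verbose mode so that
# each piece can carry its own comment.
TAG_PATTERN = re.compile(r"""
    <        # a literal opening angle bracket
    [^>]+    # one or more characters that are NOT a closing angle bracket
    >        # the literal closing angle bracket that ends the tag
""", re.VERBOSE)


def strip_tags(text: str) -> str:
    """Remove every <...> span and keep the text between the tags."""
    return TAG_PATTERN.sub("", text)


if __name__ == "__main__":
    print(strip_tags("<html><body>Hello, <b>World</b>!</body></html>"))
```

Read this way, the pattern is just “an opening bracket, anything up to the next closing bracket, then that closing bracket”. Two limits follow from that reading: a “>” inside an attribute value would end the match early, and HTML entities such as &amp; are left in the text untouched.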

I was a great fan of the APL programming language, and it had the same feature of being impossible to figure out after the fact.

Nonetheless, Copilot created me some code that I could use even if I didn’t fully understand how it worked.


Query Number 2

I wanted a slight modification to that. HTML can include URLs in its HREF values, and that was information I didn’t want thrown out, so I then asked Copilot:


uses
  System.RegularExpressions;

function ReplaceHrefTags(const Input: string): string;
var
  RegExHref, RegExTags: TRegEx;
begin
  // Regular expression to match href tags and extract href value
  RegExHref := TRegEx.Create('<a\s+[^>]*href="([^"]+)"[^>]*>');
  // Replace href tags with just their href values
  Result := RegExHref.Replace(Input, '$1');

  // Regular expression to match all other HTML tags
  RegExTags := TRegEx.Create('<[^>]+>');
  // Remove other HTML tags
  Result := RegExTags.Replace(Result, '');
end;

// Example usage:
var
  Original, Processed: string;
begin
  Original := '<html><body>Visit <a href="http://example.com">this link</a> for more information.</body></html>';
  Processed := ReplaceHrefTags(Original);
  WriteLn(Processed); // Output will be: "Visit http://example.com for more information."
end.


Excellent!  Just one more thing.


Query Number 3


uses
  System.RegularExpressions;

function ProcessHTML(const Input: string): string;
var
  RegExHref, RegExTags: TRegEx;
begin
  // Regular expression to match <a> tags with href attributes and capture both href value and inner text
  RegExHref := TRegEx.Create('<a\s+[^>]*href="([^"]+)"[^>]*>(.*?)</a>');
  // Replace <a> tags with "href value (inner text)"
  Result := RegExHref.Replace(Input, '$1 ($2)');

  // Regular expression to match all other HTML tags
  RegExTags := TRegEx.Create('<[^>]+>');
  // Remove remaining HTML tags
  Result := RegExTags.Replace(Result, '');
end;

// Example usage:
var
  Original, Processed: string;
begin
  Original := '<html><body>Visit <a href="http://example.com">this link</a> for more information.</body></html>';
  Processed := ProcessHTML(Original);
  WriteLn(Processed); // Output will be: "Visit http://example.com (this link) for more information."
end.


I am rather impressed. I ended up with the code that I wanted after only 3 queries to Copilot. And it provided me with the code, an example, and an explanation of how it worked.
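The part of the Query 3 pattern that does the real work is its two pairs of parentheses, which capture the URL and the link text so the replacement can reuse them. The sketch below shows the same idea in Python rather than Delphi, only so it is easy to run; Python writes the captured groups as \1 and \2 where Delphi uses $1 and $2. Like Copilot’s pattern, it assumes the href value is in double quotes. The names LINK_PATTERN, TAG_PATTERN and links_to_text are illustrative only.

```python
import re

# <a\s+ ... href="(group 1: the URL)" ... >(group 2: the link text)</a>
LINK_PATTERN = re.compile(r'<a\s+[^>]*href="([^"]+)"[^>]*>(.*?)</a>')
TAG_PATTERN = re.compile(r"<[^>]+>")


def links_to_text(html: str) -> str:
    """Turn each link into 'URL (text)', then drop all remaining tags."""
    # \1 is the URL captured from href, \2 is the text inside the link.
    with_links = LINK_PATTERN.sub(r"\1 (\2)", html)
    # Any tags still present carry nothing we want to keep.
    return TAG_PATTERN.sub("", with_links)


if __name__ == "__main__":
    sample = ('<html><body>Visit <a href="http://example.com">this link</a> '
              'for more information.</body></html>')
    print(links_to_text(sample))
```

The order of the two steps matters: the links have to be rewritten first, because the general tag pattern would otherwise delete the <a> tags along with the URLs inside them.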

I know it would have taken me considerably more than 5 minutes to write that code myself without the use of regular expressions.

Thank you Copilot! Maybe you can’t completely translate my Russian documents for me yet, but it looks like you could be very useful to me for my programming.


Just One More Thing

While testing this code on some of my data, I found one more revision was needed. When the text of a link is the same as its URL, the URL should be shown only once.


uses
  System.RegularExpressions;

function ProcessHTML(const Input: string): string;
var
  RegExHref, RegExTags: TRegEx;
begin
  // Regular expression to match <a> tags with href attributes and capture both href value and inner text
  RegExHref := TRegEx.Create('<a\s+[^>]*href="([^"]+)"[^>]*>(.*?)</a>');

  // Replace <a> tags based on the condition
  Result := RegExHref.Replace(Input,
    function(const Match: TMatch): string
    begin
      if Match.Groups[1].Value = Match.Groups[2].Value then
        Result := Match.Groups[1].Value  // Include only href value
      else
        Result := Match.Groups[1].Value + ' (' + Match.Groups[2].Value + ')';  // Include href value and text
    end
  );

  // Regular expression to match and remove all other HTML tags
  RegExTags := TRegEx.Create('<[^>]+>');
  Result := RegExTags.Replace(Result, '');
end;

// Example usage:
var
  Original, Processed: string;
begin
  Original := '<html><body>Check out <a href="http://example.com">http://example.com</a> and <a href="http://another.com">this link</a>.</body></html>';
  Processed := ProcessHTML(Original);
  WriteLn(Processed); // Output will be: "Check out http://example.com and http://another.com (this link)."
end.


But this time there was a problem. The code would not compile.

Would Copilot be able to fix it?


uses
  System.SysUtils,            // StringReplace and rfReplaceAll
  System.RegularExpressions;

function ReplaceHrefTags(const Input: string): string;
var
  RegExHref, RegExTags: TRegEx;
  Matches: TMatchCollection;
  Match: TMatch;
  Processed: string;
begin
  Processed := Input;

  // Regular expression to match <a> tags with href attributes and capture both href value and inner text
  RegExHref := TRegEx.Create('<a\s+[^>]*href="([^"]+)"[^>]*>(.*?)</a>');
  Matches := RegExHref.Matches(Processed);

  // Process each <a> tag match
  for Match in Matches do
  begin
    if Match.Groups[1].Value = Match.Groups[2].Value then
      Processed := StringReplace(Processed, Match.Value, Match.Groups[1].Value, [rfReplaceAll])
    else
      Processed := StringReplace(Processed, Match.Value, Match.Groups[1].Value + ' (' + Match.Groups[2].Value + ')', [rfReplaceAll]);
  end;

  // Regular expression to match all other HTML tags
  RegExTags := TRegEx.Create('<[^>]+>');
  // Remove all remaining HTML tags
  Result := RegExTags.Replace(Processed, '');
end;

// Example usage:
var
  Original, Processed: string;
begin
  Original := '<html><body>Check out <a href="http://example.com">http://example.com</a> and <a href="http://another.com">this link</a>.</body></html>';
  Processed := ReplaceHrefTags(Original);
  WriteLn(Processed); // Output will be: "Check out http://example.com and http://another.com (this link)."
end.


And that worked!

Very well done Copilot.
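
The rule in that final version (show the URL alone when the link text repeats it, otherwise show both) can also be written as a single replacement pass in which a small function decides the output for each match. Below is a minimal sketch of that approach in Python rather than Delphi, only so it is easy to run and check. It is not the code Copilot gave me, and the names render_link and process_html are made up for illustration.

```python
import re

LINK_PATTERN = re.compile(r'<a\s+[^>]*href="([^"]+)"[^>]*>(.*?)</a>')
TAG_PATTERN = re.compile(r"<[^>]+>")


def render_link(match: re.Match) -> str:
    """Decide the replacement text for one matched <a ...>...</a> link."""
    url, text = match.group(1), match.group(2)
    if url == text:
        return url               # link text repeats the URL: show it once
    return f"{url} ({text})"     # otherwise keep both the URL and the text


def process_html(html: str) -> str:
    """Rewrite links with render_link, then remove all remaining tags."""
    with_links = LINK_PATTERN.sub(render_link, html)
    return TAG_PATTERN.sub("", with_links)


if __name__ == "__main__":
    sample = ('<html><body>Check out '
              '<a href="http://example.com">http://example.com</a> and '
              '<a href="http://another.com">this link</a>.</body></html>')
    print(process_html(sample))
```

Because the function is called once per match, each link is handled in place, and there is no need to search the whole string again for every match as the loop with StringReplace does.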

RootsTech 2025 - Sat, 8 Mar 2025

Over the past 3 days, RootsTech took place in Salt Lake City. RootsTech is the largest annual genealogy conference in the world, with tens of thousands of in-person attendees and several million online attendees.

I went to RootsTech in person three times, in 2012, 2014 and 2017.

In 2018, RootsTech started live streaming some of their talks. In 2021, due to Covid, the conference was completely online. They resumed the in-person conference a couple of years ago, but with millions of viewers online during the Covid years, they smartly decided to continue streaming many of the talks.

If you went to the RootsTech website this year between March 6 and 8, you would have seen the list of 418 sessions. Of those, 254 were in person and 244 were available online, with 80 of the in-person sessions also streamed live online. With so many sessions available online, there was quite a selection of talks to choose from.



Attending Online

Of course by not attending in person, you will not be able to spend time with all your friends you made at previous conferences or make new friends. But if you couldn’t go, then the online experience is the next best thing.

There are a lot of sessions broadcast live, including the keynote sessions.

When attending online, you can browse through the list of sessions and decide which ones you want to “attend”. You can watch the live sessions when they are broadcast, or watch them any time afterwards. If you watch them live, you can join the live chat with other viewers. Most sessions also have a syllabus that you can download. 


Choosing Your Sessions

It’s really up to you to choose the sessions that interest you the most or have speakers you enjoy listening to. Most of the sessions are an hour each, and the live ones end with about 10 minutes of Q&A from the live audience.

Over the past few years and this year as well, I’ve picked about a dozen of the sessions of most interest to me. This included some of the keynotes as well as some of my favorite speakers such as Adina Newman (DNA and Endogamy), Steve Little (Artificial Intelligence) and CeCe Moore (Ethical Standards and Investigative Genealogy). Watching about 5 hours on Thursday, 5 on Friday and 2 on Saturday was enough to exhaust me as if I was there in person.

I very much enjoyed CeCe Moore’s presentation, and it was the one that most impressed and surprised me.


She wonderfully explains her involvement for the past 11 years and gives a behind-the-scenes description of the PBS genealogy show Finding Your Roots, hosted by Dr. Henry Louis Gates Jr. But then, to my surprise, she tells of how they turned the tables and did research with Dr. Gates as the subject. It included a sneak preview of the episode that is to be broadcast on April 8, where CeCe sits in the presenter’s chair. I can’t wait to see the full episode.


Session Archives

Don’t feel you can only watch the RootsTech sessions during the few days when the RootsTech Conference is taking place each year. RootsTech keeps an online archive of most of the sessions back to 2019 and these can be watched any time. There are more than 1500 sessions and 3000 other recordings in their catalog to watch.


Continuing Education 2024 - Wed, 1 Jan 2025

A few years ago, the Association of Professional Genealogists @APGgenealogy started requiring that members report at least 12 hours of Continuing Education each year. I found the task of listing my CE time quite interesting and have posted my list each year.

Below is my Continuing Education activity list for 2024. Each event was 1 hour unless otherwise noted.

Webinars – Total 28 hours

  • Jan 3 – Artificial Intelligence and Genealogy: New Beginnings in 2024 – Steve Little, Legacy Family Tree Webinars
  • Jan 17 – Resolving Conflicting Birthdates and DNA Painter – Lisa Cooke, Genealogy Gems Podcast (30 min)
  • Feb 13 – MyHeritage Knowledge Base – James Tanner, Family Tree Webinars
  • Feb 13 – Meet MyHeritage’s Global Experts – Daniel Horowitz, FamilySearch Facebook Live
  • Feb 14 – AI and Genealogy – Thomas MacEntee, Family Tree Webinars
  • Feb 21 – Genealogy Smart Start – Elizabeth Gomoll, Family Tree Webinars
  • Feb 26 – Learn How to Create a Digital Genealogy Workflow – Janine Adams, MyHeritage Live (from Mar 30, 2023)
  • Mar 17 – Sorting Your DNA Match Lists – Kelli Bergheimer, webinar for the Allen County Public Library
  • Mar 20 – 7 Tips for Using AI as Your Genealogy Assistant – Dana Leeds, Family Tree Webinars
  • Mar 22 – Finding the Generation of Connection – Diahan Southard, Your DNA Guide
  • Apr 9 – Ask the Experts with Genetic Genealogist “The DNA Geek” Leah Larkin – Diahan Southard, Your DNA Guide (30 min)
  • Apr 11 – A Guide to Cemetery Photography – Judy Russell, Family Tree Webinars
  • Apr 12 – 6 Tips to Build a Family Tree for Busy People – Tara Shymanski, Family Tree Webinars
  • Apr 26 – DNA for Genealogy – An Evolving Journey – Jennifer Zinck, Association of Professional Genealogists (90 min)
  • May 1 – Strategies to Analyze Endogamous DNA – Alec Ferretti, Family Tree Webinars
  • May 5 – Advance Your Jewish Research in the Former Russian Empire with Important Online Resources and Strategies – Joe Everett, Santa Cruz Jewish Genealogy Society (120 min)
  • May 22 – A Tour of Goldie May’s Genealogy Power Tools – Richard Miller – Legacy Family Tree Webinars
  • June 6 – Using AI to Analyze and Visualize Your Census Data – Dana Leeds – YouTube (20 min)
  • July 9 – 10 Reasons Why MyHeritage is a Genealogist’s Best Friend – Melissa Barker, Family Tree Webinars
  • Aug 8 – Preparing for 3rd Party DNA Tools by Sara Cochran – Virtual Genealogical Association (Live presentation was June 30)
  • Sep 18 – Practical Chromosome Mapping: Gaining Insights from Segments of DNA – Jonny Perl, Family Tree Webinars
  • Oct 19 – Rubik’s Cube Genealogy: A New Twist on Your Old Data – Elissa Scalise Powell, Family Tree Webinars
  • Oct 19 – Keeping Track of Surnames – Fiona Brooker, Family Tree Webinars (20 min)
  • Nov 12 – Turn Your MyHeritage Record Matches into a Research Roadmap – Lisa Alzo, Family Tree Webinars
  • Nov 20 – Top Ten AI Genealogy Breakthroughs of 2024 – Steve Little, Family Tree Webinars (90 min)
  • Nov 22 – Information Overload? Effective Project Planning, Research, Data Management & Analysis – Elizabeth Shown Mills, Family Tree Webinars (80 min)
  • Dec 4 – Challenging Some DNA Myths: Learn to Separate DNA Fact from Fiction – Paula Williams, Family Tree Webinars (80 min)
  • Dec 17 – I (Think I) Have a DNA Network: Now What? – Diahan Southard, Family Tree Webinars

Conferences (Online) – Total 13 hours

  1. Feb 29 to Mar 2 – RootsTech 2024
    • Genetic Genealogy Turns 25: DNA Experts Tell Their Stories – Diahan Southard
    • What’s New at Ancestry in 2024 – Crista Cowan
    • The Knowles Collection: What Is It and How Do I Use It – W. Todd Knowles (30 min)
    • Drowning in Family Photos: Why Metadata Makes Your Library Searchable and Enjoyable – Rich Harrington
    • U.S. Synagogue Records as Genealogical Resources – Ellen Kowitt
    • Using FamilySearch for Jewish Research – W. Todd Knowles
    • FamilySearch GEDCOM 7: Popularity, Resources and Future – Gordon Clarke (30 min)
    • Syncing AQ/PAF data with Family Tree (3 videos) – Gaylon Findlay, Ancestral Quest
    • What’s in a name? DNA, surnames and one-name studies – Debbie Kennett
  2. Nov 1 to Nov 3 – WikiTree Symposium 2024
    • Creating an Ancestor Sketch – Thomas MacEntee
    • Finding the records for “impossible” genealogy. Lessons learned from a Chinese genealogist – Linda Yip
    • Hopping Down the Paper Trail – Sara Cochran
    • Using AI in Family History Research – Melanie McComb
    • From Name Collection to Name Confirmation: Adding Leaves to the Right Branch – Cheri Hudson Passey

In total, my time in 2024 was 41 hours. By comparison, 2023 was 39.5 hours and 2022 was 38 hours.