Ancestors and Algorithms: AI for Genealogy

Ep. 29: How to Find Ancestors in Historical Newspapers Using AI

Brian Season 1 Episode 29


Have you ever searched for an ancestor in a newspaper database and found nothing, even though you were certain the information had to be there? You are not searching wrong. You are searching with the wrong strategy. And in this episode, that changes.

Episode 29 of Ancestors and Algorithms is a full AI tool showdown: Perplexity vs. Claude, head to head on the same newspaper research challenge. Same ancestor. Same mystery. Two completely different jobs. By the end of this episode you will know exactly which tool to reach for at every stage of your newspaper research, and you will have three copy-paste ready AI prompts that work on completely free databases like Chronicling America and Fulton History.

Here is what we cover: how to use Perplexity AI to build a newspaper research strategy before you ever open a database — including how to find ethnic-language newspapers, Polish-language newspapers, German-language newspapers, and immigrant community papers that English-language archives completely overlook. Then how to use Claude AI to fix garbled OCR text in digitized newspaper scans, extract hidden genealogical facts from historical obituaries, and apply the cluster research method to find ancestors who almost never appear in direct name searches.

The case study follows a Polish immigrant ancestor in Luzerne County, Pennsylvania in the 1880s through 1914. After two years of failed searches, an unreadable OCR obituary transcript led to four new research directions — an immigration year, a previously unknown Pennsylvania city connection, a church affiliation that opens parish records, and a census discrepancy pointing to an undiscovered child death record.

Topics and search terms covered in this episode include: how to search Chronicling America effectively, how to fix OCR errors in old newspaper scans, how to find an ancestor's obituary online for free, how to use AI for genealogy research, Perplexity AI genealogy prompts, Claude AI for document analysis, historical newspaper research tips, how to break through a genealogy brick wall, immigrant ancestor research strategies, Polish genealogy research, genealogy research for women, cluster research genealogy, FAN club genealogy method, Newspapers.com alternatives, GenealogyBank vs Chronicling America, Genealogical Proof Standard, free genealogy tools, family history research with AI, and how to read old handwriting in genealogy documents.

Whether you are searching Ancestry, FamilySearch, Newspapers.com, GenealogyBank, or free archives, the AI techniques in this episode work across every platform. No paid subscriptions required to get started. This episode is for beginner and intermediate genealogists, family history researchers, or anyone tracing immigrant ancestors, solving brick walls, or getting more from digitized historical newspaper collections.

Visit ancestorsandai.com for show notes, transcripts, prompts, and the Companion Guide.

Connect with Ancestors and Algorithms:

📧 Email: ancestorsandai@gmail.com
🌐 Website: https://ancestorsandai.com/
📘 Facebook Group: Ancestors and Algorithms: AI for Genealogy - www.facebook.com/groups/ancestorsandalgorithms/

Golden Rule Reminder: AI is your research assistant, not your researcher.

Join our Facebook group to share your AI genealogy breakthroughs, ask questions, and connect with fellow family historians who are embracing the future of genealogy research!

New episodes every Tuesday. Subscribe so you never miss the latest AI tools and techniques for family history research.




Picture this. You've been searching for your ancestor for years. Census records, church records, vital records. Nothing you do gets past a certain brick wall. And then someone in your Facebook group says, did you check newspapers? Newspapers. You know they exist. You've probably even tried searching one or two. But here's the thing. Most genealogists are searching newspapers completely wrong. And when you add AI to the mix, the results change dramatically. I found an obituary this week that I had completely missed in two years of searching. The text was nearly unreadable, but AI helped me read it. And what it told me about my ancestor's life sent me in four new research directions I had never considered. Today, we're pitting two AI tools head-to-head on the exact same newspaper research challenge. Same ancestor, different tools, very different results. And I'm going to show you which one to reach for and exactly how to use it. Let's dig in. Welcome to Ancestors and Algorithms, where family history meets artificial intelligence. I'm your host Brian, and today we're tackling a topic I've been getting requests about practically since the first episode. Newspapers. Specifically, how do you use AI to find your ancestors in historical newspapers, make sense of what you find, and connect it back to your broader research?

Now, whether you're a complete beginner who has never opened a newspaper database, or someone who's been searching for years and keeps coming up empty, this episode is for you. We're starting from scratch, building up to some genuinely powerful techniques, and I promise you're going to walk away with at least one approach you've never tried before. Let's get into it.

Okay, I want to start with something honest, because it is the kind of thing I think we all need to hear. For most of my research life, newspapers were that resource I kept meaning to use more seriously, but never quite did. I would search for a name, come up empty, or come up with hundreds of completely irrelevant hits, and then move on. I told myself newspapers were a long shot, too unpredictable, too hard to search. Sound familiar? Here's what I eventually figured out, and it completely changed how I approached newspaper research. The problem was not the databases. The problem was that I was walking into a newspaper search the same way I walk into a census search. Name, date range, location, go. And that works reasonably well for structured records where the information is organized in predictable fields. But newspapers are not structured records. Newspapers are noise. Extraordinary, valuable, one-of-a-kind noise. And you cannot approach noise with a structured record mindset. Think about the kinds of things you might find about an ancestor in a newspaper that you would never find anywhere else. Not just obituaries, though those can be incredible. Think about a small notice in the legal section when they filed a land claim. A blurb in a social column when they visited a cousin three towns over. A letter to the editor. A mention in a business advertisement. A court notice. A report on a fire that destroyed their property. A school honor roll with a grandchild's name. None of those things show up when you search Margaret Hoffmeister, 1878. You find them when you understand what kinds of things people wrote about in their community. And you build your search strategy around the realistic ways your ancestor might have appeared in print. This is where AI changes everything. Because building a smarter search strategy is exactly what AI is very, very good at. Now, before we get into the tools, I want to address something head-on. Because I know it's going to come up.
Newspaper archives cost money. Some of them, anyway. And I want to be completely upfront with you about what is free and what is not before we go any further. Here's the landscape. There are three tiers here. The first tier is completely free. Chronicling America, hosted by the Library of Congress, is the crown jewel of free newspaper research. It was just upgraded in August 2025 with a whole new interface. We're talking over 23 million newspaper pages from all 50 states, covering 1756 through 1963. Zero cost. No account required. It is also fully text searchable, which matters enormously when we get to our AI strategies. Also in the free tier: Fulton History at FultonHistory.com. That's F-U-L-T-O-N-H-I-S-T-O-R-Y.com. This is a remarkable project run by one man, Tom Tryniski, a retired engineer who has personally scanned over 57 million newspaper pages. It started as a New York collection and has expanded significantly. Free to use, though you do need to register for an account. If your ancestors lived in New York, this is an absolute must. And then there are state digital newspaper archives. Many states have their own free collections, often hosted through the state library or historical society. It is always worth checking your specific state before you pull out your credit card. The second tier is free through your library. Sites like Ancestry and Newspapers.com offer institutional access through many public libraries. If your library card gets you remote access to their digital collections, you may already have access to Newspapers.com without knowing it. Check your library's website before you pay for anything. The third tier is paid. Newspapers.com, which is owned by Ancestry, offers two levels. The basic plan runs about $8 a month and it covers the older content. The Publisher Extra plan is about $20 a month and it adds everything published after that, running up to recent years for many titles.
GenealogyBank is another paid option and it's worth knowing about because 95% of their collection is exclusive to them. You will not find those titles anywhere else. They run about $8 to $10 a month on an annual plan. Findmypast is one I mention for listeners in the UK and Australia because it has an enormous collection of British and Irish newspapers that you simply cannot get elsewhere. And MyHeritage also has a collection called OldNews.com. I'm not completely familiar with their pricing plan. Now, here's the really important thing I want you to hear. Everything I'm going to teach you today works on free databases just as well as it works on paid ones. The AI techniques we're covering are about how you search and how you interpret what you find. Whether you're searching Chronicling America for free or Newspapers.com on a paid subscription, you're using the same AI strategies. The tool does not change. The skill does not change. Only the collection you are searching changes. So, no matter where you are on the budget spectrum today, this episode applies to you. And remember what we always say around here. AI is your research assistant, not your researcher. We're going to use AI to be dramatically smarter about how we approach newspaper research. But the actual searching, the evaluation of what we find, the verification against other sources, that part is still on us. That is still what makes this genealogy and not just a data exercise. All right. Let's talk about how this showdown is going to work. I've set up a challenge. I'm researching a fictional ancestor. Let me introduce you to her. Her name is Louisa Brennan Kowalczyk. Born around 1861, probably in Pennsylvania. Her family was Polish immigrants. She married a German-American named Heinrich Kowalczyk in the mid-1880s. They had six children. She died sometime around 1914. I have the basic framework of her life from census reports and one death certificate.
But I want to know more about who she actually was. The challenge? Using only AI assistants. Build a smarter newspaper research strategy for Louisa. And then show how to make the most of whatever we find. Two tools. Same challenge. Let's see what happens. 

Let's start with Perplexity. If you've not used Perplexity before, here's the quick version. It is a search engine with AI on top. When you ask it a question, it does not just return links. It searches the web, synthesizes the results, and gives you a cited answer. That citation piece is critical for genealogy because we need to know where information is coming from. The free version of Perplexity gives you unlimited basic searches. That is genuinely unlimited. And it gives you a limited number of what they call pro searches each day, which use a more powerful research mode. For today's newspaper research task, the free version handles most of what we need. The pro plan is $20 a month if you want to remove the limits on those deeper searches. Now, here's the first wow moment of this episode. And I mean it because this one reframed how I think about newspaper research entirely. Before I ever open a newspaper database, I use Perplexity to answer a question that most genealogists skip entirely. What newspapers actually existed in the place and time my ancestor lived? Think about why this matters. If you are searching Chronicling America or Newspapers.com without knowing what titles covered your ancestor's community, you might be searching papers that never wrote about your ancestor's town at all. You might be missing the one small weekly paper that was the heart of their community. You could search for years and come up empty not because the information does not exist, but because you were looking in the wrong newspaper. AI fixes that and it fixes it fast. Here's the exact prompt I used for Louisa. You can copy this directly. Quote, I am researching an ancestor who lived in Luzerne County, Pennsylvania between 1880 and 1914. She was from a Polish immigrant community. Please research which historical newspapers were published in Luzerne County during this period. 
Include the names of any newspapers with an ethnic or immigrant community focus, the years they were published, which ones have been digitized, and where those digitized versions can be accessed for free or through subscription, end quote. Now, what came back was genuinely impressive. Perplexity identified the Wilkes-Barre Record, the Wilkes-Barre Times, and the Wyoming Valley Daily News as the major papers of that era. But then it flagged something I would not have known to look for. There were several Polish-language newspapers published in Pennsylvania during the late 1800s and early 1900s, including the Gwiazda out of Philadelphia and the Narod Polski published by the Polish Roman Catholic Union. For a Polish immigrant community like Louisa's, these ethnic papers would have covered weddings, deaths, church events, and community news that the English-language papers would never have printed. Perplexity also told me which of these titles had been digitized and where to find them. Some are on Chronicling America. Some are only accessible through GenealogyBank. And a few are on microfilm at specific Pennsylvania archives and have not been digitized at all. That is a research roadmap built in about two minutes. Now, here's the second prompt. This is the one that gets at the GPS work. This is about building what professional genealogists call a reasonably exhaustive search. I followed up with this. Quote, Based on the newspapers you identified, help me build a research strategy for finding mentions of a woman named Louisa Kowalczyk, who lived in Luzerne County, Pennsylvania, around 1880 to 1914. She was the wife of a German-American man and had six children. Please suggest what kinds of notices and articles would typically appear in these papers about an ordinary working-class woman of that era, what search terms and name variations I should try, and which time windows are most likely to have coverage of her life events. End quote. This is where it gets interesting.
Because Perplexity did not just tell me to try her name. It gave me a layered strategy. For the English-language papers, it suggested searching for legal notices, which would have appeared when Heinrich conducted any property transactions. Marriage announcements and social columns, which in that era often included the bride's family background. Obituaries for both Louisa herself and potentially her children and parents, since obituaries in that period often named surviving family members extensively. Church news, particularly around confirmations and first communions for her children, which local papers regularly printed. And any reports of accidents, fires, or crimes in their neighborhood. For the Polish-language papers, it flagged something I never would have thought of. In Polish immigrant communities of that era, it was common practice to publish notices when a family received a letter from relatives in Poland, or when someone returned from a visit to the old country. Those notices are genealogical gold. They can tell you where in Poland the family originated, information that can be nearly impossible to find any other way. And on the name variations front, Perplexity reminded me that Polish surnames were frequently anglicized in American newspapers. Kowalczyk might appear as Kowalski, as Kowalsky with a Y, or under other phonetic spellings. And Louisa's maiden name, Brennan, suggests possible Irish heritage on one side, which adds another layer of community connections to explore. Total time to build this research strategy? Maybe 10 minutes with the two prompts and reading through the results. Without AI, this kind of background research would have taken me hours of digging through county histories and newspaper directories. And here's the thing. This is all GPS, Genealogical Proof Standard, element one in action. We are not just randomly searching. We're building a plan for a reasonably exhaustive search before we ever type a name into a database.
That is what professional genealogists do. AI just makes it faster. 
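For listeners who like to script the repetitive part, the name-variation step can be sketched in a few lines of Python. This is only a minimal illustration and not something used in the episode: the function name `build_search_terms` is made up for the example, and the variants "Louise" and "Henry" are assumed anglicized forms added for illustration, not results from Perplexity. It simply combines first names, surname spellings, and the "Mrs. Husband" form into one list of phrases to try.

```python
from itertools import product


def build_search_terms(first_names, surnames, husband_first_names=()):
    """Return de-duplicated search phrases, in the order they were built."""
    terms = []
    # Direct name searches: every first name with every surname spelling.
    for first, last in product(first_names, surnames):
        terms.append(f"{first} {last}")
    # Married women of the era were often printed under the husband's name.
    for husband, last in product(husband_first_names, surnames):
        terms.append(f"Mrs. {husband} {last}")
    # Surname-only searches catch notices with a missing or mangled first name.
    terms.extend(surnames)
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return list(dict.fromkeys(terms))


# Hypothetical variant lists for the fictional case study.
terms = build_search_terms(
    first_names=["Louisa", "Louise"],
    surnames=["Kowalczyk", "Kowalski", "Kowalsky"],
    husband_first_names=["Heinrich", "Henry"],
)

for term in terms:
    print(term)
```

Working through a list like this one phrase at a time, and noting which database each phrase was tried in, is one simple way to keep the search reasonably exhaustive rather than ad hoc.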

Alright, now we get to the part where Claude comes in. Let me set the scene. I use the strategy Perplexity built and I open Chronicling America. I search for Louisa Kowalski, then Louisa Kowalczyk, then Mrs. Heinrich Kowalski. I try GenealogyBank next with three more variations. Each search, nothing directly about Louisa. Here's what most researchers do at this point. Close the tab and move on. Here's what I did instead. I searched for Heinrich Kowalski, Louisa's husband. And I found a small notice in the 1901 Wilkes-Barre Record about a property transfer. Two lines, Heinrich Kowalski selling a lot on Cary Avenue to a man named Stanislaw Wojciechowski. Now, on its own, that is mildly interesting. But what do I do with two lines about a property sale? I brought it to Claude with this prompt. Quote, This is a brief property notice from a 1901 Pennsylvania newspaper. The individuals mentioned are Heinrich Kowalski, the seller, and Stanislaw Wojciechowski, the buyer. Based on your knowledge of Polish-American immigrant communities in early 20th century Pennsylvania, what can you tell me about the likely relationship between these two men, what the timing and location of the sale might suggest about the family's circumstances, and what additional records this transaction might have generated that I should search, end quote. What came back shifted my whole direction. Claude explained that Cary Avenue in Wilkes-Barre was in a dense Polish immigrant neighborhood. A property sale between two Polish surnames in 1901 often signaled a family in transition, possibly financial pressure from mine industry instability, possibly preparing to move, possibly that Heinrich's health was failing and he was settling his affairs. And it flagged something I had not thought of. Wojciechowski is a distinctive enough surname that searching that name in newspapers might surface references to the Kowalczyks indirectly.
Polish immigrant families in that era appeared together repeatedly in church news, organizational notices, and social columns. That is the cluster research principle, and it led me somewhere. A search for Wojciechowski turned up a church notice from 1903. The Wojciechowski family was listed as godparents at the baptism of the Kowalskis' youngest child. That is a documented relationship between these families. Still no obituary for Louisa herself. But I am building a picture, and I am getting closer to the dates when one might exist. I went back to Chronicling America and searched a narrower date window. Perplexity's strategy had suggested that Louisa likely died around 1914 based on the death certificate I had. I searched specifically in 1914 in the Wilkes-Barre papers using Kowalski without a first name. And I found it. An obituary. The Wilkes-Barre Times Leader, April 1914, Mrs. Louisa Kowalski. The problem? The OCR was a disaster. The text was barely readable. If you've ever stared at a garbled OCR transcript wondering whether it is even worth trying to work with, this is the moment where AI changes everything. Here's the prompt I used. Quote, I found this obituary in a 1914 Pennsylvania newspaper. The text has been poorly transcribed by OCR software and contains significant errors. Please do the following. First, reconstruct the most likely original text, flagging any words you are uncertain about with brackets and a confidence level. Second, identify every person named and their relationship to the deceased. Third, list every specific genealogical fact the obituary contains, including any that are implied rather than directly stated. Here's the raw text. End quote. Then I pasted in the mess of garbled letters. What Claude returned was remarkable. It reconstructed the full obituary with confidence levels for each uncertain word. And the words "came to this country in 1882" were genealogical gold that I did not have before. A specific immigration year.
That opens up ship manifests, naturalization records, possibly port of entry records. But here's what I love most about using Claude for obituary analysis. It is not just about fixing the garbled words. It is about making sure you do not miss anything in the rush of excitement over finding the document. I used a second prompt. Quote, based on this obituary, please create a list of every follow-up research task it suggests, ranked from most to least likely to produce new genealogical information. Include specific record types, repositories, and the genealogical questions each would answer. End quote. Claude came back with eight research leads I had not written down yet. The immigration year, suggesting a specific ship manifest search. The Hazleton connection, suggesting earlier Pennsylvania records before Wilkes-Barre. The five versus six children discrepancy, suggesting I review my census records more carefully. A mention of St. Stanislaus Church in the obituary, pointing to parish records. The phrase, after an illness of several weeks, suggesting I look for hospital or physician records. A surviving husband and children, suggesting extended research into their records, where Louisa might appear as a named parent. Now, before I got carried away, I did something essential. I went back to the original scanned image in Chronicling America, and I verified Claude's reconstruction word by word for every bracketed term. Some I could confirm. A few I could not fully verify because the scan was too degraded. Those I noted in my research log as uncertain. That is not a failure. That is good research. AI is your research assistant, not your researcher. Claude helped me read something I could not read alone, and gave me a roadmap for what to do next. But the verification? The judgment call about what the evidence actually proves? That is still mine to make. One more technique before we move on, because it applies to almost everyone who searches newspapers.
Sometimes you search, and instead of nothing, you get 30 or 40 results. And the thought of reading through all that garbled OCR text to find the three that might actually be relevant is discouraging enough to make you stop. Here's how to handle that with Claude. Quote, I'm going to paste text from multiple newspaper search results. Some may mention my ancestor. Others are false positives from OCR errors or coincidental name similarities. Please sort these by likelihood of relevance to my research target, flag any that seem to be about a different person with the same name, and identify anywhere the text is too garbled to evaluate without checking the original image. My ancestor is [name, location, approximate years, and any distinguishing details], end quote. Then paste in the raw text from your results. Claude gives you a ranked triage list. You still do the actual reviewing, but you do it in a fraction of the time, starting with the most promising results instead of wading through everything in order.
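If you have more results than you want to paste into a chat window, a rough local pre-sort can help decide which snippets to hand to Claude first. The sketch below is a swapped-in technique, not what Claude does internally and not something shown in the episode: it uses Python's standard `difflib.SequenceMatcher` to score how closely any word in a snippet resembles the target surname, so that OCR-damaged spellings still rank high. The function names and the three sample snippets are invented for illustration.

```python
import re
from difflib import SequenceMatcher


def best_surname_score(snippet, surname):
    """Highest similarity (0.0 to 1.0) between the surname and any word in the snippet."""
    # Keep digits inside words so OCR damage like "Kowa1ski" stays one token.
    words = re.findall(r"[A-Za-z0-9]+", snippet)
    return max(
        (SequenceMatcher(None, surname.lower(), w.lower()).ratio() for w in words),
        default=0.0,
    )


def rank_snippets(snippets, surname):
    """Return (score, snippet) pairs, most promising first."""
    scored = [(best_surname_score(s, surname), s) for s in snippets]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Invented sample snippets, not real search results.
snippets = [
    "Coal prices rose again this week",
    "Mrs. L0uisa Kowa1ski, aged 53 years",
    "Mrs. Louisa Kowalski died at her home",
]

ranked = rank_snippets(snippets, "Kowalski")
for score, text in ranked:
    print(f"{score:.2f}  {text}")
```

A score like this only measures spelling similarity. It cannot tell two people with the same name apart, so the relevance judgment and the check against the original image still have to happen afterward.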

So let's talk about the showdown verdict, because I promised you a comparison. Perplexity versus Claude for newspaper research. Which one wins? The honest answer is it's not a competition. They are doing different jobs, and you need both. Perplexity wins the discovery phase. When you need to know what newspapers existed, where they are digitized, how to access them, what search strategies to build, and what historical context explains the community you are researching, Perplexity, with its real-time web search capabilities, is doing something Claude fundamentally cannot do. It is going out to the internet right now and finding current information about what collections exist and where. Claude wins the analysis phase. Once you've found something, once you have a document in hand, whether it is a garbled OCR obituary or a two-line property notice, you want Claude. Claude's strength is making sense of what you've found, extracting every possible genealogical value, flagging what needs verification, and generating the follow-up research queue that keeps your investigation moving. Here's the workflow I now use for every newspaper research session. Step 1, Perplexity. What newspapers covered my ancestor's community and time period? Which are digitized? Which are free? What search terms and name variations should I try? Step 2, search. Systematically, using the strategy Perplexity built, not just the name. Step 3, Claude. When I find something, anything worth examining, I bring it to Claude for analysis. The more context I give it, the better the analysis. Step 4, verify. Everything Claude interprets or reconstructs gets checked against the original source image. Non-negotiable. And the payoff of this workflow with Louisa? Let me tell you what the obituary actually gave me. An immigration year I did not have. 1882. A previous connection to Hazleton, Pennsylvania that I had not known about. A church affiliation, St. Stanislaus, that opens up parish records.
A family-size discrepancy. Five children in the obituary versus six in the census? That tells me I probably have a child who died young and whose death record I have not found yet. And a window for the illness before her death that suggests I should look for hospital records in Luzerne County. Four new research directions from one garbled obituary that AI helped me read. I also want to address something directly because it comes up a lot when I talk about AI-assisted newspaper research. People sometimes ask whether using AI to decode a garbled transcript is somehow less rigorous than doing it yourself. My answer? Absolutely not. As long as you verify. Claude made the garbled text legible. I went back to the original scan and confirmed its reconstruction word by word. That is exactly the standard professional genealogists use for any transcription. The tool that helped you read it does not diminish the quality of the final citation. What matters is that you verified against the original source. 
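The word-by-word verification step is easier to keep honest with a checklist. Here is a minimal sketch of turning a reconstruction into one, under a loud assumption: it supposes the uncertain words come back in the form `[word, 90%]`, which is a hypothetical format chosen for this example, since the exact output depends on how you word the prompt. The sample sentence and its confidence numbers are invented, and the names `FLAG`, `Check`, and `build_checklist` are made up for illustration.

```python
import re
from dataclasses import dataclass

# Assumed, hypothetical flag format: [word, 90%]
FLAG = re.compile(r"\[([^\],]+),\s*(\d{1,3})%\]")


@dataclass
class Check:
    word: str
    confidence: int
    # Updated by hand after comparing against the original scanned image:
    # "confirmed" or "uncertain".
    status: str = "unverified"


def build_checklist(reconstruction):
    """Collect every flagged word, least confident first."""
    checks = [
        Check(word=m.group(1).strip(), confidence=int(m.group(2)))
        for m in FLAG.finditer(reconstruction)
    ]
    return sorted(checks, key=lambda c: c.confidence)


# Invented sample text with invented confidence levels.
sample = "Mrs. Louisa Kowalski [came, 90%] to this country in [1882, 60%]."

checklist = build_checklist(sample)
for item in checklist:
    print(f"{item.confidence:>3}%  {item.word:<10} {item.status}")
```

Sorting by lowest confidence first puts the words most likely to be wrong at the top of the list, which is where the time with the original image is best spent. Anything that cannot be confirmed stays marked as uncertain in the research log.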

Okay, let's pull this together because I want you to be able to take this and use it today. Here's the rapid-fire recap of what we covered. Before you ever open a newspaper database, use Perplexity to do your discovery work. Ask it what newspapers existed in your ancestor's community, which are digitized, where to find them free or via subscription, and what kinds of notices those papers typically ran for people like your ancestors. The free tier handles this completely. Once you're searching, think beyond the name search. Use the keyword and context strategies AI helps you build. And when you hit a wall, try the people around your ancestors. The cluster research pivot, searching for neighbors and associates, is how I found the thread that eventually led to Louisa's obituary. When you find something, bring it to Claude. A garbled OCR transcript does not have to stop you. Claude can reconstruct what the OCR mangled, extract every genealogical fact in the document, and generate a research queue of what to look for next. That obituary gave me four new research directions I had not seen in two years of working on this family. And always verify. Every reconstruction Claude makes, every interpretation, gets checked against the original source image. That is not a disclaimer. That is genealogy. Now, for your homework assignment this week. Pick one ancestor you've been stuck on. Open Perplexity and type this exact prompt. Quote, I am researching an ancestor who lived in [county and state] between [year range]. Please tell me what historical newspapers served that community, which are digitized, where I can access them, and what kinds of notices these papers typically published about ordinary residents, end quote. That's it. One prompt. See what you get back. Then, try Chronicling America first with whatever titles Perplexity identifies, if they fall in its collection. It's free, and it was just upgraded in August 2025 with a cleaner interface.
And if you find something, even a small something, bring it to Claude. Ask it, quote, what is the maximum genealogical value I can extract from this document? Who is mentioned? What events does it document? What follow-up research does it suggest, end quote? Come tell us what you found. Head over to ancestorsandai.com, where you'll find the link to our private Facebook community. I genuinely want to hear about your discoveries. Now, I mentioned at the top that we covered a lot of ground today. The three basic prompts I showed you, those are a solid foundation. But newspaper research has a lot of depth. There are advanced techniques around searching ethnic-language newspapers, using Perplexity's deep research mode, and using Claude for multi-document comparison when you find multiple articles about the same event. We go deep on all of that in the Companion Guide for this episode. The guide includes 12 advanced prompts built specifically for newspaper research, covering everything from legal notice mining to social column analysis to building a GPS-compliant citation for newspaper sources. If you want the full professional toolkit for AI-powered newspaper research, that is where to find it. But everything you need to get started is right here in today's episode. The free content stands on its own. Next week, we're going back to basics in the best possible way. Episode 30 is one I've been building toward for a while. We're talking about the Genealogical Proof Standard, all five elements, and how AI helps you meet each one. This is the episode that separates good genealogy from great genealogy. And I'm going to show you how AI tools map to each element in a way that I've never seen explained anywhere else. You do not want to miss this one. Thank you so much for listening to Ancestors and Algorithms. If today's episode was useful, please leave a review wherever you listen to podcasts. And if you know a fellow genealogist or a family history researcher who's been avoiding newspaper research because it feels too complicated, share this episode with them. That is genuinely the best way to help our community grow. I'm your host, Brian, and I will see you next week for another journey into the past powered by the future. Until then, happy researching.