What family archives are for, now that the AI can read them
I have three collections of letters sitting in my house.
The oldest is a stack from my great-grandparents, written between 1910 and 1913. The second is from my grandparents, written during World War II. The third is from my father — letters he wrote from the moment I was born until the moment his mother died.
There are other things too. Memoirs. Loose documents. A family record that shows up in more places than I can keep track of. But those three collections are the weight of it. Three generations, each one writing to the next, each one leaving behind something I’ve never fully read.
I’ve always known the letters were there. What I didn’t have, until very recently, was a real way to sit with them.
If you’ve ever inherited a box of family paper, you know the problem. A few hundred pages of handwriting is not something you read casually on a Saturday. It’s a project. Genealogists charge by the hour for transcription. Archival services exist, but they’re priced for institutions. Cloud AI services can do it, but I’m not putting my grandparents’ wartime letters into someone else’s data center.
So for most of my life, the answer has been: someday. Someday I’ll retire and sit at the kitchen table with a magnifying glass. Someday I’ll scan them properly. Someday I’ll find the time.
Something shifted in the last year, and I want to write about it while it’s still new.
I can now read the entire archive at my own desk, on hardware I already own, with no one else in the loop. Not “kind of read it” — actually transcribe it, actually analyze it, actually turn a thousand pages of cursive into something I can search, annotate, and print into a book. In hours. On a machine that’s been sitting in my office for years doing other work.
That’s a different kind of someday.
I built a small pipeline to do this. Three stages, all running locally. Stage one reads the scanned pages and transcribes the handwriting. Stage two goes through the transcribed text and pulls out what it’s about — who, when, where, what’s being discussed. Stage three takes everything and typesets it into a keepsake PDF I can actually hold.
No cloud calls. No API bills. No family letters leaving the house.
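For readers who think in code, the shape of those three stages can be sketched in a few lines of Python. This is only an illustration, not the actual pipeline: every name in it (`Letter`, `run_pipeline`, and the `fake_*` stand-ins) is hypothetical, and the stand-ins return canned text where a local handwriting model, a local language model, and a PDF typesetter would go.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Letter:
    """One scanned page as it moves through the pipeline (hypothetical type)."""
    scan_path: str
    text: str = ""                              # filled in by stage one
    notes: dict = field(default_factory=dict)   # filled in by stage two


def run_pipeline(
    scan_paths: Iterable[str],
    transcribe: Callable[[str], str],   # stage one: scanned page -> text
    analyze: Callable[[str], dict],     # stage two: text -> who/when/where/what
    typeset: Callable[[list], str],     # stage three: all letters -> one book
) -> str:
    """Run every page through stages one and two, then typeset the lot."""
    letters = []
    for path in scan_paths:
        letter = Letter(scan_path=path)
        letter.text = transcribe(path)
        letter.notes = analyze(letter.text)
        letters.append(letter)
    return typeset(letters)


# Stand-ins so the sketch runs without any model, scanner, or network.
# Real stages would call locally hosted models instead (assumption).
def fake_transcribe(path: str) -> str:
    return f"Dear Mother, (transcription of {path})"


def fake_analyze(text: str) -> dict:
    return {"who": ["Mother"] if "Mother" in text else [],
            "words": len(text.split())}


def fake_typeset(letters: list) -> str:
    return "\n\n".join(f"[{l.scan_path}]\n{l.text}" for l in letters)


book = run_pipeline(
    ["page_001.png", "page_002.png"],
    fake_transcribe, fake_analyze, fake_typeset,
)
print(book)
```

Passing the stages in as plain functions keeps each one swappable: a better transcription model or a different book layout replaces one argument and leaves the rest alone.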
The scale I’m working with is somewhere between four and six hundred pages of fully processed material so far, on a path toward roughly fifteen hundred once the rest of the corpus comes through. The transcription work that a human would have measured in months is measured in hours on my desk. The out-of-pocket cost, once the hardware is already there, is close to zero.
I want to be careful about the tone here. This isn’t a story about how impressive the technology is. It’s a story about what the technology is finally letting me do.
Here’s how I think about the motivation. Most of it is selfish, in the honest sense. I want to study my own family history, understand the people who came before me, and adopt the best of what my ancestors did. That’s the core of it. I want to know these people. I want to learn from them.
The line I keep coming back to is this: I want to use this technology to have the opportunity, in my lifetime, to dig into all the material, understand it, analyze it, and hopefully gain some wisdom from it, and then to share it with future generations so that it stays preserved and available to them.
In my lifetime. That’s the part that’s new. Not “eventually.” Not “if I’m lucky in retirement.” Now.
This thread is part of why Tractor and Silo exists in the first place.
Tractor and Silo is the software I’m building for life tracking — a way to capture experiences, people, and places so the record of your own life is actually readable later. My exposure to the family archive is a big part of how it got designed. Watching what my ancestors left behind — and what they didn’t, and what I couldn’t easily get to — shaped a lot of my thinking about how to make a personal record that holds up over decades and hands. My ancestors had paper, ink, and the post. I have a database and a phone. The instinct is the same one; what changed is the tools.
The archive in my office and the app on my phone are the same project, separated by a century. That parallel isn’t something I stumbled on halfway through the transcription. It’s something I had in mind when I started building the software in the first place.
This kind of work is starting to show up at institutional scale too. Last year, Imperial War Museums announced a project with Capgemini and Google Cloud that used Google’s Gemini models to transcribe 20,000 hours of oral history from veterans and civilians — roughly 8,000 interviews recorded between 1945 and the early 2000s. Work that would have taken a team 22 years by hand, done in weeks.
That’s the shape of what’s happening. Not a gimmick. A real change in who gets to open these kinds of archives and when. The news for me, sitting here, is that the same shape of work has come down to a scale an individual can run at home.
As part of this deep focus on my family history and its archive of artifacts, I am now using these tools to play around in the archive, research it, mine it, and analyze it, and then to share it with others, both for deeper understanding and for the value it holds.
The “share it with others” part matters to me. Some of what I’m finding is for me alone. Some of it is for my kids. Some of it is for cousins and relatives who never got to read any of this. The archive has been sitting in one house, in one set of boxes, effectively inaccessible to anyone who might care about it. The goal is to change that. Preserved, readable, and in more than one set of hands.
My personal mantra for writing has always been that I want to write about what it’s like living in the world at this time. This seems like a perfect case study: living through a technological revolution in which artificial intelligence lets us do all kinds of things we could not have done before.
A box of letters from 1910 is a record of what it was like living in the world at that time. The fact that I can finally read them is a record of what it’s like living in the world now. Both are worth keeping.
In the next post, I’ll walk through how I run all of this on a Mac Studio that isn’t new anymore.