Privacy Please
Welcome to "Privacy Please," a podcast for anyone who wants to know more about data privacy and security. Join your hosts Cam and Gabe as they talk to experts, academics, authors, and activists to break down complex privacy topics in a way that's easy to understand.
In today's connected world, our personal information is constantly being collected, analyzed, and sometimes exploited. We believe everyone has a right to understand how their data is being used and what they can do to protect their privacy.
Please subscribe and help us reach more people!
This podcast is part of The Problem Lounge network — conversations about the problems shaping our world, from digital privacy to everyday life.
S7, E268 - AI Can Unmask Your Anonymous Account for $4 | Here's How
Your anonymous account isn't anonymous anymore. Researchers just proved it costs $4 to find out who you are.
In February 2026, a team from ETH Zurich and Anthropic published a paper that quietly ended the era of practical online anonymity. Their AI pipeline, using nothing but your posts, comments, and forum activity, correctly identified 67% of pseudonymous users from a pool of 89,000 candidates. No name. No photo. No metadata. Just your words.
This episode breaks down exactly how it works, why it's different from every deanonymization scare before it, who's most at risk, and what you can actually do about it.
In this episode:
- How the ESRC pipeline (Extract, Search, Reason, Calibrate) works (see the code sketch after this list)
- Why previous anonymity attacks required structured data, and this one doesn't
- Why commercial AI safety guardrails didn't stop it
- What "practical obscurity" meant, and why it's gone
- Concrete steps to reduce your exposure today
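For the technically curious, here is a minimal sketch of what an Extract-Search-Reason-Calibrate loop could look like, assuming a generic sentence-embedding model and an LLM client you wire up yourself. Every function name, prompt, model choice, and threshold here is illustrative, not taken from the paper.

```python
# Illustrative ESRC-style (Extract, Search, Reason, Calibrate) pipeline sketch.
# Not the paper's code: llm() is a stand-in for any chat-model client, and
# the prompts and parameters are assumptions for demonstration only.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works

def llm(prompt: str) -> str:
    """Stand-in for a chat-model call; replace with your own client."""
    raise NotImplementedError

def extract(posts: list[str]) -> str:
    # Stage 1: distill the identity signals the author leaked unintentionally.
    return llm("Summarize the location, occupation, hobbies, and writing "
               "quirks evident in these posts:\n" + "\n---\n".join(posts))

def search(profile: str, candidate_bios: list[str], k: int = 10) -> list[int]:
    # Stage 2: embed the profile and rank candidate bios by cosine similarity.
    vecs = embedder.encode([profile] + candidate_bios, normalize_embeddings=True)
    sims = vecs[1:] @ vecs[0]           # dot product == cosine on unit vectors
    return list(np.argsort(-sims)[:k])  # indices of the k closest candidates

def reason(profile: str, shortlist: list[str]) -> str:
    # Stage 3: have the model weigh the evidence and pick a best match.
    return llm(f"Profile:\n{profile}\n\nCandidates:\n" + "\n".join(shortlist)
               + "\n\nWhich candidate matches best, and why?")

def calibrate(verdict: str, threshold: float = 0.8) -> str | None:
    # Stage 4: score confidence and abstain below the threshold.
    score = float(llm(f"Rate your confidence from 0 to 1 for:\n{verdict}"))
    return verdict if score >= threshold else None  # None == abstain
```

Note how each stage is an ordinary, individually innocuous operation (summarize, embed, rank, score), which is exactly why the episode argues platforms can't easily detect the attack.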
Links:
- Research paper: arxiv.org/abs/2602.16800
- Delete your Reddit history: redact.dev
- Tor Project: torproject.org
- Signal: signal.org
Privacy Please is part of The Problem Lounge network. 🌐 theproblemlounge.com 🎙️ Subscribe on Apple Podcasts, Spotify, or wherever you listen
Anonymous Online No Longer Means Safe
SPEAKER_00: You have a Reddit account. Maybe it's where you vent about your job, ask questions you've never asked out loud, talk about something you're going through. You picked a username that means nothing. You never posted your name. You thought that should be enough. It's not anymore. This week, researchers published a paper that changes the math on online anonymity permanently. They built a system using AI that can figure out who you are from your posts alone. No name, no photo, no metadata, just your words. And it cost just $4.

The idea of pseudonymity online is almost as old as the internet itself. You create a handle, you separate it from your real life, and you get to exist in two worlds at once. The assumption was simple: as long as nobody connects the dots, you're safe. That assumption has a name. Researchers call it practical obscurity, the idea that your data might technically be out there, but connecting it to you would take so much time and effort that no realistic adversary would bother. It was never a perfect protection. Skilled investigators, journalists, law enforcement, obsessive ex-partners have always been able to deanonymize people through manual work: cross-referencing a writing quirk here, a location mentioned there, a niche hobby that narrows you down to a few hundred people. The difference is that it took time to do that. It took human hours. That was the moat. Researchers just drained it.

In February 2026, a team from ETH Zurich and Anthropic published a paper called Large Scale Online Deanonymization with LLMs. Six researchers, one very uncomfortable finding. They built a four-stage pipeline they named ESRC: Extract, Search, Reason, and Calibrate. Here's how it works. The first one is extract. The model reads everything you've posted and builds a profile of you, not from what you said you were, but from what you revealed without meaning to: your approximate location, your job, the conferences you mentioned, the niche hobby you can't stop talking about, your writing patterns, spelling, regional slang, etc. The next one is search. It converts that profile into a mathematical representation and runs it against a database of real identities built from LinkedIn and other platforms, looking for the closest match. The next one is reason. The LLM then reasons over the candidates, weighing evidence, narrowing the field. And the last one is calibrate. It assigns a confidence score, and either it makes a call or it abstains.

The results: 67% of Hacker News users were correctly matched to their real LinkedIn profiles from a pool of 89,000 candidates. The whole experiment cost less than $2,000, or $1 to $4 per person. Previous automated methods for the same task managed 0.1% recall. The LLM approach didn't improve on that. It obliterated it.

They didn't stop at Hacker News. They tested it on Reddit movie communities. They tested it on anonymized interview transcripts, including responses from a survey Anthropic ran about how people use AI in daily life. Working only from casual conversational answers, the system identified 7% of participants. That sounds low, but the researchers point out what matters: the system identified anyone at all from general conversation. No structured data, no overlapping fields, just words. And lead researcher Simon Lerman said something that should stick with you forever: LLMs are not discovering hidden secrets. They are automating what skilled investigators could already do manually. The capability was always there; what changed is the cost of deploying it. Every few years, someone publishes a deanonymization paper, the internet briefly panics, and then everyone moves on.
So why is this one different? I'll give you three reasons. The first one is it works on freeform text. Every previous attack required structured data, two databases with overlapping fields you could cross-reference. This one doesn't. It reads the way a human reads. It can work from a Reddit comment, an interview transcript, a forum post. The messiness of natural language used to be protection. Not anymore. The second one is the attack looks like normal usage. The pipeline is built from individually innocent steps: summarizing text, generating embeddings, ranking candidates. No single step looks malicious. That makes it almost impossible for platforms to detect or block automatically. You can't flag someone as reasoning about your identity the way you can flag a SQL injection. The third one is safety guardrails don't hold. The researchers tested commercial LLM safety features. The model sometimes declined, but small prompt changes bypassed those refusals every time. And open-source models have no guardrails at all.

The researchers are explicit about who's at risk here: journalists protecting sources, activists in authoritarian contexts, whistleblowers, domestic abuse survivors who use pseudonyms to stay safe, anyone who relies on the separation between their online and offline life for reasons that matter. But also, honestly, most of us. The average person has accounts they've kept separate for reasons that are personal, not political. That separation is now fragile in a way it wasn't six months ago.

Here's what you can actually do about it. First, the honest answer. If you've been consistently posting on a pseudonymous account for years, mixing in details from your real life, the protection you thought you had was already thinner than you realized. This research doesn't create a new threat; it lowers the cost of a threat that already existed. Here's what actually moves the needle, though.

Stop cross-contaminating. The attack works by aggregating identity signals across platforms. If your pseudonymous Reddit account and your professional LinkedIn mention the same conference, the same niche software library, the same city, that's a signal. The fix isn't a new username, it's discipline about what you share and where.

Treat your content as the identifier. Writing style, vocabulary, niche interests, biographical details: these are all functionally the same as your name. The researchers found that movie discussion alone was enough to identify users. The more specific and personal your posts, the easier you are to find.

Purge old posts. Reddit has account deletion tools; so does Twitter, now X. If an account has years of accumulated detail that you no longer want attached to you, cleaning it is now a meaningful privacy action. Not perfect, scraped data persists, but it limits ongoing exposure.

Use separate browsers or profiles for separate identities. Not just different usernames, different browsing environments, so cross-platform behavioral fingerprinting has less to work with. If real anonymity matters to you, it comes down to compartmentalization and information hygiene.

Here's the researchers' own advice. Ask yourself whether a team of skilled investigators could figure out who you are from your posts. If yes, the AI can too, and it costs less than a meal out. The research paper is called Large Scale Online Deanonymization with LLMs. Six researchers, one number that keeps coming back: four dollars.
That's the cost to unmask a person. Not a corporation, not a surveillance state with unlimited resources, anyone with an API key and a few bucks. What the researchers actually said is worth sitting with: the average online user has long operated under an implicit threat model where they assumed pseudonymity provides adequate protection, because targeted deanonymization would require extensive effort. LLMs invalidate this assumption. The cost is gone.
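A concrete version of the "purge old posts" advice from the episode: the sketch below uses PRAW, the Python Reddit API wrapper, to delete an account's comment and submission history. Treat it as a starting point under stated assumptions, not a guarantee; the credentials are placeholders, and already-scraped copies of your posts persist elsewhere.

```python
# Sketch: bulk-delete your own Reddit history with PRAW (pip install praw).
# Credentials are placeholders; create an app at reddit.com/prefs/apps.
# Caveat: Reddit's listings only expose roughly your most recent 1,000
# items per type, and deletion does not remove third-party scraped copies.
import praw

reddit = praw.Reddit(
    client_id="YOUR_CLIENT_ID",
    client_secret="YOUR_CLIENT_SECRET",
    username="YOUR_USERNAME",
    password="YOUR_PASSWORD",
    user_agent="history-purge by u/YOUR_USERNAME",
)

me = reddit.user.me()

for comment in me.comments.new(limit=None):  # newest first
    comment.delete()

for submission in me.submissions.new(limit=None):
    submission.delete()
```

Services like redact.dev typically add an edit-before-delete pass, overwriting the text before removal, on the theory that some scrapers keep only the last edited version; whether that helps against any given archive isn't guaranteed.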