Privacy Please
Welcome to "Privacy Please," a podcast for anyone who wants to know more about data privacy and security. Join your hosts Cam and Gabe as they talk to experts, academics, authors, and activists to break down complex privacy topics in a way that's easy to understand.
In today's connected world, our personal information is constantly being collected, analyzed, and sometimes exploited. We believe everyone has a right to understand how their data is being used and what they can do to protect their privacy.
Please subscribe and help us reach more people!
This podcast is part of The Problem Lounge network — conversations about the problems shaping our world, from digital privacy to everyday life.
S7, E270 - The 40-Minute Hack That Stole the Blueprint for AI | The Mercor Breach
A normal data breach steals names and passwords. This one may have stolen the recipe for building the world’s most powerful AI models, and it happened through software most people will never notice until it breaks. We follow the Mercor breach from the first warning signs to the moment poisoned Python packages hit PyPI and spread in minutes across systems that were set to auto-update.
We walk through what Mercor actually does in the AI economy, especially RLHF (Reinforcement Learning from Human Feedback), and why that behind-the-scenes work shapes how tools from OpenAI, Anthropic, Meta, and Google behave. Then we unpack LiteLLM, the open source "plumbing" that connects apps to multiple AI services, and how a supply chain attack can bypass the company you're targeting by compromising the dependencies everyone trusts.
From there, the focus shifts to the fallout: contractors whose Social Security numbers and identity documents may be exposed, companies scrambling to assess backdoors and credential theft, and the bigger fear that proprietary AI training data sets and labeling strategies are being auctioned on the dark web. We also dig into the compliance controversy around SOC2 and ISO 27001 style certifications and what happens when security audits become performance instead of protection.
If you care about cybersecurity, data privacy, AI governance, and open source risk, listen through to the end for concrete steps you can take right now. Subscribe, share this with a friend who uses AI tools, and leave a review with your take on who should be held accountable.
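For the developers in the audience, the "plumbing" idea from the episode can be sketched in a few lines of Python. This toy router is purely illustrative: the handler names and fake responses are invented for the example, and it is not LiteLLM's actual code. What it shows is why a single routing layer is such a high-value target; one compromised release of a library like this touches credentials and traffic for every AI provider behind it.

```python
# Toy sketch of a "unified LLM interface": one completion() call
# routes to whichever provider the model name implies.
# The handlers below are fakes; a real router would call each
# vendor's HTTP API with that vendor's credentials.

def _openai_handler(model, messages):
    return f"[openai:{model}] echo: {messages[-1]['content']}"

def _anthropic_handler(model, messages):
    return f"[anthropic:{model}] echo: {messages[-1]['content']}"

# Map a model-name prefix to the provider that serves it.
ROUTES = {
    "gpt": _openai_handler,        # e.g. "gpt-4o"
    "claude": _anthropic_handler,  # e.g. "claude-3-opus"
}

def completion(model, messages):
    """Dispatch one chat request to the right provider by model prefix."""
    for prefix, handler in ROUTES.items():
        if model.startswith(prefix):
            return handler(model, messages)
    raise ValueError(f"no provider registered for model {model!r}")
```

Because every request funnels through a single chokepoint like `completion()`, poisoning that chokepoint is strictly more valuable to an attacker than breaching any one provider.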
A Different Kind Of Data Breach
I'm pretty sure you've heard about data breaches before. Right? A hospital gets hacked, a retailer exposes credit card numbers, someone's Social Security number ends up on the dark web. This one is definitely different, though. Three weeks ago, hackers didn't just steal personal data. They may have walked out with the actual blueprints for how the world's most powerful AI models are built. The secret sauce behind OpenAI, behind Meta, behind Anthropic. Yeah, those guys. And they got in through a piece of software almost nobody has ever heard of that was downloaded 97 million times last month alone. This is the story of the Mercor breach. And if you've been listening to this show, you're in it. Let's go. Alrighty then, ladies and gentlemen, welcome back to another episode of Privacy Please. I am your host, Cameron Ivey, and today we're following the money, baby. The data. And a hacking group that just pulled off what might be the most consequential AI heist in American history, or world history. Let's go ahead and dig in. But before we do, I must remind you that the Privacy Please podcast is part of the Problem Lounge Network. If you're not familiar with it, check it out: theproblemlounge.com, the website for our show The Problem Lounge and for Privacy Please, the show you're listening to. Go give it some love, go follow us, go find our YouTube channel, The Problem Lounge Network. We're on there. All right, let's jump in. Most people have never heard of Mercor. That's kind of the point. The company was founded in 2023 by three friends who met on their high school debate team; they were 19 years old when they started it. Mercor sits at one of the most critical and least visible junctions in the entire AI economy. Here's what it does: it recruits experts, doctors, lawyers, engineers, scientists, writers, and pays them to help train AI models. They read outputs, rank responses, flag errors, and provide the high-quality human feedback that makes the AI smarter.
That process is called RLHF, Reinforcement Learning from Human Feedback, and it's the engine behind every major AI system you've ever used. Mercor's clients: OpenAI, Anthropic, Meta, and Google. Those are pretty big names. By October 2025, the company was valued at $10 billion. Its three founders, all 22 years old, became the youngest self-made billionaires in the world. And nobody's ever heard of them. Then, on March 27, 2026, Mercor became the center of one of the most consequential cybersecurity incidents in the history of artificial intelligence. Here's where it gets fascinating and alarming in equal measure. The hackers didn't attack Mercor directly. They didn't break down the front door; they poisoned the water supply, so to speak. Their target was a piece of software called LiteLLM. Most people outside of software development have never heard of it, but if you've used any AI tool in the last two years, there's a very good chance it was running underneath. LiteLLM is an open source Python library that lets developers connect their applications to AI services, OpenAI, Anthropic, Google, all of them, through a single unified interface. It's downloaded 97 million times a month and estimated to be present in 36% of all cloud environments. It is essentially the plumbing of the AI industry. A hacking group called Team PCP had been working its way upstream for weeks. They first compromised a security scanner called Trivy and used it to steal the credentials of a LiteLLM maintainer. Then, on March 27th, they used those stolen credentials to publish two poisoned versions of LiteLLM directly to PyPI, the main repository where developers download Python software. The malicious code was live for approximately 40 minutes.
Now, in those 40 minutes, because thousands of systems are configured to automatically pull the latest version of the software they depend on, the poisoned packages were downloaded and installed across thousands of companies. Poisoned Packages. Sounds like a good 90s band name. Anyway, the malware harvested credentials, established backdoor access, and spread silently before anyone knew what was happening. A security researcher named Colum McCaham discovered the compromise at 11:48 UTC when his own machine crashed, and PyPI quarantined the package by 13:38 UTC. So 40 minutes. 40 minutes, thousands of victims. Mercor was one of them. This is where the story stops being about software and starts being about people. The hacking group Lapsus$, a notorious extortion gang with a track record of hitting major corporations, claimed responsibility for the Mercor breach specifically and listed the stolen data for auction on the dark web. What they claim to have taken: four terabytes of data, roughly the equivalent of 4,000 hours of video or 4,000 copies of an encyclopedia. The contents, and this is interesting: 939 gigabytes of Mercor's platform source code; three terabytes of video and identity data, including interview recordings and Social Security numbers for over 40,000 contractors; internal Slack communications and ticketing data; and, potentially the detail that's caused the most alarm in Silicon Valley, proprietary training data sets and the labeling strategies developed specifically for Meta, OpenAI, and Anthropic. Let that last one sink in a little bit. Not just personal data: the actual methodologies used to build the world's most powerful AI models may now be in the hands of hackers who are actively auctioning them to the highest bidder. Y Combinator CEO Garry Tan called it a major national security issue, noting that an incredible amount of state-of-the-art training data was now potentially available to adversaries, including state-sponsored labs in China.
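The auto-update failure mode described above is easy to check for in your own projects. Below is a minimal, hypothetical sketch, not a real tool, that flags `requirements.txt` lines lacking an exact `==` pin; a floating dependency like a bare package name or a `>=` range is exactly what pulled the poisoned releases in automatically.

```python
# Sketch of a dependency audit: flag requirements lines that "float"
# (no exact "==" pin). Floating deps auto-pull whatever release is
# newest, including a poisoned one published minutes earlier.

def unpinned(requirements_text):
    """Return the requirement lines that lack an exact '==' version pin."""
    flagged = []
    for line in requirements_text.splitlines():
        line = line.split("#")[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if "==" not in line:
            flagged.append(line)
    return flagged

reqs = """\
litellm        # floating: pulls whatever is newest
requests==2.31.0
numpy>=1.26    # a range is still floating
"""
print(unpinned(reqs))  # → ['litellm', 'numpy>=1.26']
```

A real audit would go further (hash-pinning, lockfiles, delayed adoption of fresh releases), but even this check surfaces the dependencies that update themselves without anyone's approval.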
Meta responded by indefinitely suspending all work with Mercor. OpenAI said it's investigating but hasn't paused. And 40,000-plus contractors, real people who trusted Mercor with their Social Security numbers, their identities, their work, woke up to find their data being auctioned on the dark web. Here's the layer of the story that nobody is talking about enough. LiteLLM had security certifications, SOC 2 and ISO 27001, the industry-standard stamps of approval that tell companies this software has been audited and is safe to use. Those certifications were issued by a company called Delve Technologies. Delve was a hot startup founded in 2023: Forbes 30 Under 30 founders, Y Combinator alumni, a $32 million Series A, 1,700 customers. They pitched agentic AI that could compress compliance certification timelines from months to days for as little as $6,000 to $15,000, a fraction of what traditional audits cost. On March 18th, nine days before the LiteLLM attack, an anonymous Substack account published a post titled "Delve: Fake Compliance as a Service." Here's the allegation: Delve wasn't actually auditing anything. They were generating compliance certificates using AI, rubber-stamping software as secure without doing the real work. The certifications that were supposed to protect LiteLLM, and by extension every company that trusted it, may have been theater. TechCrunch described the collision of these two stories as Silicon Valley's two biggest dramas intersecting: the company responsible for certifying your security was allegedly faking its own certifications, and the software it certified became the vector for one of the most significant supply chain attacks in AI history. You might be thinking, I've never heard of Mercor or LiteLLM, this doesn't affect me. But here's the thread that connects this episode to everything we've covered this season.
In "You're the Teacher Now," we talked about how companies are using your data, your work, your patterns, your decisions to train AI models without explicit consent. Mercor is exactly the infrastructure that process runs through. The experts it hired to label data, rank outputs, and teach AI how to think are real people whose identities are now on the dark web. And if you've used any AI-powered tool in the last two years, a chatbot, a coding assistant, a writing tool, there's a meaningful chance it was connected to services running LiteLLM. The plumbing was compromised. You just didn't know it. The broader point is this: the AI industry is moving at a speed its own security infrastructure can't keep up with. A 40-minute window was enough to compromise thousands of companies. A startup promising cheap compliance certificates created a systemic vulnerability across the entire ecosystem. And the companies building the most powerful technology in human history are all sharing the same fragile supply chain. Here's what you can actually do about it. If you're a contractor who worked with Mercor, your Social Security number and identity documents may be compromised. Freeze your credit now at all three bureaus: Equifax, Experian, and TransUnion. It's free, it takes 10 minutes, go do it. If you're a developer, audit your dependencies. Know which open source libraries your systems are pulling automatically. Pin versions instead of grabbing the latest; the attack worked because systems were configured to auto-update. If you're a regular person using AI tools, there's nothing actionable today, but just know this: the infrastructure underneath the tools you use is more fragile and more interconnected than anyone has publicly admitted. The Mercor breach is not going to be the last of its kind. There's a detail in this story I can't stop thinking about. The three founders of Mercor, the company at the center of all of this, met on their high school debate team.
They built a $10 billion company by the age of 22, and they're now at the center of what may be the most consequential data breach in the history of artificial intelligence. The attackers got in through 40 minutes of poisoned software. The compliance certifications that were supposed to prevent it were allegedly fake. And the data that was supposed to be teaching AI how to be smarter is now being auctioned on the dark web. This isn't a story about a hack. It's a story about how fast we built something and how much we assumed about how safe it was. I'm Cameron Ivey. This was Privacy Please, part of the Problem Lounge Network. Thank you so much for listening. If you're new, please subscribe. I hope you liked it, and we'll keep putting out more stuff like this each and every week. Share it with someone you think could benefit from it, and we'll see y'all in the next one. Cameron Ivey. Over now.