Yesterday in AI

A rundown of all of the important stories in AI that happened yesterday in 10 minutes or less.

The Company Building AI to Build Itself Just Asked the World to Hit Pause

June 06, 2026 • Mike Robinson

Yesterday in AI | Saturday, June 6, 2026

Anthropic dropped a document this week that might be the most honest thing a frontier AI lab has ever published: internal metrics showing Claude now writes more than 80% of the code merged into their codebase, engineers ship 8x as much code per day as in 2024, and their flagship model sped up Anthropic's training code by roughly 52x, up from about 3x for an earlier model a year before. Then they called for a global, verifiable mechanism to slow or stop AI development, right before their IPO.

That story, plus: someone leaked Anthropic's next unreleased model and sold it via a Chinese API proxy (the red-team program is now paused); OpenAI's new "Dreaming" feature builds a background profile of you while you sleep, doubling factual recall; SpaceX's IPO filing reveals a $920M/month Google cloud deal, nearly a billion dollars a month for compute; TSMC warns AI chip demand will outstrip supply for years; and Apple's WWDC drops Monday, with a rebuilt Siri running on Nvidia chips and Google's Gemini models under the hood.

Feedback? Email mike@yesterdayinai.news or connect on LinkedIn, X, or Bluesky. If you like the show, please take a minute to rate and review it so others can find it!

Mike Robinson 0:05

Hi folks, this is Yesterday in AI, your daily digest of everything happening in the world of AI in ten minutes or less. I'm Mike Robinson. It's Saturday, June 6th, and the company building the most powerful AI on the planet just published a blog post saying it might be getting too powerful to control, while also being three days away from filing to go public. That's not irony. That's just Friday in AI. Let's get into it.

We start with Anthropic, who published a post called "When AI Builds Itself," and it contains some numbers that deserve to be read slowly. As of May 2026, more than 80% of the code merged into Anthropic's own codebase was written by Claude. Not assisted by Claude, not co-written: authored. The average engineer at Anthropic is now shipping eight times as much code per day as they were in 2024. On their most open-ended internal coding tasks, not scripted demos but real messy research problems, Claude's success rate hit 76%. That's up 50 points in six months.

Here's the one that really lands. Mythos Preview, their current flagship model, sped up Anthropic's own model training code by roughly 52 times. In May 2025, an earlier Claude model got about three times. In one year, three times to 52 times.

What Anthropic is describing is called recursive self-improvement. The idea is that AI systems help build more capable AI systems, each generation faster than the last. In the post, co-author Jack Clark writes that each new version of Claude could be built by the version before it without human involvement. They're careful to say this isn't inevitable, that the loop is still incomplete, that humans still set direction, but they're also clearly saying: look at these numbers. The trajectory is real.

And then comes the part that makes this post unusual for a company with a $965 billion valuation and a pending IPO. Anthropic says it would slow down or pause entirely if other frontier labs verifiably did the same.
They're calling for an arms-control-style international framework: agreed triggers, verification mechanisms, coordination across labs in multiple countries. They want to build the infrastructure for a potential pause before anyone needs to use it. Think about what it means when the company doing the most consequential AI research in the world publishes a document saying publicly: we might need to stop. That's not PR spin. They've got nothing to gain commercially from that message. The post dropped the same week their confidential S-1 filing is under SEC review.

Humans still have the judgment layer: direction setting, deciding which research paths to trust, knowing when an idea is a dead end. That's still us. For now. But the execution layer is already mostly Claude. And if the last year's trajectory holds, the distance between those two layers is shrinking faster than most institutions are prepared for.

There's a related story that broke yesterday that Anthropic definitely did not want to be making. A red team participant in their pre-launch safety testing program leaked an unreleased Anthropic model, codenamed Oceanus, and sold access to it through a Chinese API proxy. Red team programs are how AI labs check for dangerous capabilities before a model ships. A small group of vetted external testers get access under strict confidentiality. Someone in that group apparently thought X amount of money and API resale revenue was worth more than the agreement. The model's internal designation is Claude-Oceanus-V1-P. Based on what's been reported, it's a Mythos successor, more capable than Mythos Preview, the current public-facing version. Anthropic has now paused the red team program entirely, which means the safety vetting that has to happen before this model can launch publicly just got interrupted. Two things are true simultaneously here.
This is a significant security failure, and it's also the latest proof that the demand for top-tier AI capabilities is so intense that people are willing to break agreements and risk legal consequences to get early access. A frontier model got stolen and resold before it had even been announced. The launch timeline for Oceanus is now unclear. The leak itself is notable. But the bigger signal is what it says about how much the world wants what's coming.

Shifting gears to OpenAI. They shipped a feature this week with a name that sounds like a wellness app and works more like something out of science fiction: Dreaming. Dreaming is OpenAI's new memory synthesis system for ChatGPT. Here's how it works. In the background, while you're not actively using the app, ChatGPT processes your past conversations and builds a continuously updated profile of you, with categories like travel preferences, work context, hobbies, and communication style. The system organizes this into a running summary you can review, edit, or delete at any time. The results from OpenAI's own evaluations are striking. Factual recall jumped from 41.5% to 82.8%. Preference following, how well ChatGPT actually acts on what it knows about you, went from 31.4% to 71.3%. It's rolling out to Plus and Pro users in the US first.

The practical effect is that ChatGPT gets meaningfully smarter about you every time you use it. The explicit trade-off is obvious. You're giving OpenAI a detailed, persistent, AI-interpreted record of your interests, habits, and preferences. The privacy controls are there if you want them. But the default is dream on. There's a business logic here that's easy to read. The most powerful switching cost in consumer tech isn't price, it's personalization. Every interaction that makes ChatGPT more useful because it knows you better is a data point that you'd have to give up if you switched to a competitor. Sam Altman has been talking about deeply personal AI for two years.
Dreaming is how you build that, one background synthesis at a time.

Let's talk about the infrastructure layer, because two data points this week tell you everything you need to know about what AI actually costs at the serious end of the market. First, SpaceX's IPO filing with the SEC disclosed a multi-year cloud services deal with Google. The numbers are not a typo. Google pays SpaceX $920 million per month for compute capacity, starting October 2026 through June 2029. That's roughly $33 billion in total commitments over the life of the contract. $920 million a month for compute, one contract. Companies are now locking in near-billion-dollar monthly compute commitments years in advance because the alternative is not having capacity when you need it.

Second data point: TSMC, the Taiwanese foundry that makes chips inside essentially every AI system on the planet, warned this week that AI chip demand will outstrip supply for years, not quarters. Years. Put those two together and you get a clear picture. The race for AI capability is real. The bottleneck is physical, and the companies that locked in capacity early are going to have a structural advantage over companies that didn't. The $920 million a month deal isn't a bill SpaceX is dreading. It's an asset.

Finally, a preview. Monday is Apple's WWDC, their annual developer conference, and the leaks this week filled in details that are worth knowing going in. The rebuilt Siri is coming. After roughly two years of delays, Apple is set to show the world a Siri that actually works at the level people have been promised. Here's what the pre-conference reporting says about how it's built. Siri will run on Nvidia's Blackwell chips. Cloud AI queries will be routed through Google's Gemini models. Let that sit for a second. Apple is rebuilding its signature AI assistant on a competitor's silicon and a competitor's model. Nvidia makes the chips, Google runs the cloud inference.
Apple provides the device, the interface, and the brand. This is what happens when you're three years late to a technology shift. You don't build everything from scratch; you integrate the best available infrastructure and figure out differentiation in the experience layer. Whether that strategy works is exactly what WWDC is going to start answering. iOS 27, a standalone Siri app, system-wide AI controls, a new App Store section for AI extensions, all of it reportedly dropping Monday. This is the most consequential Apple product reveal in years. For anyone who cares about where AI ends up in the daily life of two billion iPhone users, Monday matters.

Just a couple more items. If you have any feedback about this show, you can email mike@yesterdayinai.news, or you can find me on LinkedIn, X, or Bluesky. And if you like this podcast and want to see it continue, please take a minute to rate and review it so others can find it. Thanks. That's all for this edition of Yesterday in AI. Stay curious, have a great weekend, and I'll see you on Monday.

Mike Robinson

Host